differentiation, or repair of these and other tissues. Frzb-1 is a soluble antagonist of growth factors of the Wnt family that acts by binding
to Wnt growth factors in the extracellular space. A third novel protein, termed PAPC, promotes the formation of dorsal mesoderm
and somites in the embryo.




WO 97/48275 



PCT/US97/10942 






ENDODERM, CARDIAC AND
NEURAL INDUCING FACTORS



Field of the Invention

The invention generally relates to growth
factors, neurotrophic factors, and their inhibitors, and
more particularly to several new growth factors with
neural, endodermal, and cardiac tissue inducing
activity, to complexes and compositions including the
factors, and to DNA or RNA coding sequences for the
factors. Further, one of the novel growth factors
should be useful in tumor suppression gene therapy.

This application claims the benefit of U.S.
Provisional Application No. 60/020,150, filed June 20,
1996.

This invention was made with Government
support under grant contract number HD-21502, awarded by
the National Institutes of Health. The Government has
certain rights in this invention.

Background of the Invention 

Growth factors are substances, such as
polypeptide hormones, which affect the growth of defined
populations of animal cells in vivo or in vitro, but
which are not nutrient substances. Proteins involved in
the growth and differentiation of tissues may promote or
inhibit growth, and promote or inhibit differentiation,
and thus the general term "growth factor" includes
cytokines, trophic factors, and their inhibitors.

Widespread neuronal cell death accompanies
normal development of the central and peripheral nervous
systems. Studies of peripheral target tissues during
development have shown that neuronal cell death results
from the competition among neurons for limiting amounts
of survival factors ("neurotrophic factors"). The
earliest identified of these, nerve growth factor
("NGF"), is the most fully characterized and has been
shown to be essential for the survival of sympathetic
and neural crest-derived sensory neurons during early
development of both chick and rat.

One family of neurotrophic factors is the
Wnts, which have dorsal axis-inducing activity. Most of
the Wnt proteins are bound to cell surfaces. (See,
e.g., Sokol et al., Science, 249, pp. 561-564, 1990.)
Dorsal axis-inducing activity in Xenopus embryos by one
member of this family (Xwnt-8) was described by Smith
and Harland in 1991, Cell, 67, pp. 753-765. The authors
described using RNA injections as a strategy for
identifying endogenous RNAs involved in dorsal
patterning to rescue dorsal development in embryos that
were ventralized by UV irradiation.

Another member of the growth and neurotrophic
factor family was subsequently discovered and described
by Harland and Smith, which they termed "noggin."
(Cell, 70, pp. 829-840 (1992).) Noggin is a good
candidate to function as a signaling molecule in
Nieuwkoop's center, by virtue of its maternal
transcripts, and in Spemann's organizer, through its
zygotic organizer-specific expression. Besides noggin,
other secreted factors may be involved in the organizer
phenomenon.

Another Xenopus gene designated "chordin" that
begins to be expressed in Spemann's organizer and that
can completely rescue axial development in ventralized
embryos was described by Sasai et al., Cell, 79, pp.
779-790, 1994. In addition to dorsalizing mesoderm,
chordin has the ability to induce neural tissue and its
activities are antagonized by Bone Morphogenetic
Protein-4 (Sasai et al., Nature, 376, pp. 333-336,
1995).

Therefore, the dorsal lip or Spemann's
organizer of the Xenopus embryo is an ideal tissue for
seeking novel growth and neurotrophic factors. New
growth and neurotrophic factors are useful agents,
particularly those that are secreted, owing to their
ability to be used in physiologically active, soluble
forms, because these factors, their receptors, and DNA or
RNA coding sequences therefor and fragments thereof are
useful in a number of therapeutic, clinical, research,
diagnostic, and drug design applications.

Summary of the Invention 

In one aspect of the present invention, the
sequence of the novel peptide that can be in
substantially purified form is shown by SEQ ID NO:1.
The Xenopus derived SEQ ID NO:1 has been designated
"cerberus," and this peptide is capable of inducing
endodermal, cardiac, and neural tissue development in
vertebrates when expressed. The nucleotide sequence
which, when expressed, results in cerberus is
illustrated by SEQ ID NO: 2. Since peptides of the
invention induce endodermal, cardiac, and neural tissue
differentiation in vertebrates, they should be able to
be prepared in physiologically active form for a number
of therapeutic, clinical, and diagnostic applications.

Cerberus was isolated during a search for
molecules expressed specifically in Spemann's organizer
containing a secretory signal sequence. In addition to
cerberus, two other novel cDNAs were identified.



The Xenopus derived peptide of SEQ ID NO: 3
is a novel protein we had
earlier designated as "frazzled," a secreted protein of
318 amino acids that has dorsalizing activity in Xenopus
embryos. We now designate the novel protein as
"frzb-1." The gene for frzb-1 is expressed in many
adult tissues of many animals; three of the cDNAs
(Xenopus, mouse, and human) have been cloned by us. The
accession numbers for the Xenopus, mouse, and human
frzb-1 cDNA sequences
are U68059, U68058, and U68057, respectively. Frzb-1
has some degree of sequence similarity to the Drosophila
gene frizzled, which has been shown to encode a
seven-transmembrane protein that can act both as a signalling
and as a receptor protein (Vinson et al., Nature, 338,
pp. 263-264, 1989; Vinson and Adler, Nature, 329, pp.
549-551, 1987). Vertebrate homologues of Frizzled have
been isolated and they too were found to be anchored to
the cell membrane by seven membrane spanning domains
(Wang et al., J. Biol. Chem., 271, pp. 4468-4476, 1996).
Frzb-1 differs from the frizzled proteins in that it is
an entirely soluble, diffusible secreted protein and
therefore suitable as a therapeutic agent. The
nucleotide sequence derived from Xenopus that, when
expressed, results in frzb-1 protein is illustrated by
SEQ ID NO: 4. The frzb-1 protein derived from mouse is
shown as SEQ ID NO: 7, while the mouse frzb-1 nucleotide
sequence is SEQ ID NO: 8. The human derived frzb-1
protein is illustrated by SEQ ID NO: 9, and the human
frzb-1 nucleotide sequence is SEQ ID NO: 10.

Frzb-1 is an antagonist of Wnts in vivo, and
thus is believed to find utility as a tumor suppressor
gene, since overexpressed Wnt proteins cause cancer.
Frzb-1 may also be a useful vehicle for solubilization
and therapeutic delivery of Wnt proteins complexed with
it.

The final cDNA isolated containing a signal
sequence results in a peptide designated Paraxial
Protocadherin (PAPC). The cDNA for PAPC is a divergent
member of the cadherin multigene family. PAPC is most
related to protocadherin 43 reported by Sano et al., The
EMBO J., 12, pp. 2249-2256, 1993. As shown in SEQ ID
NO: 5, the PAPC gene encodes a transmembrane protein of
896 amino acids, of which 187 are part of an
intracellular domain. PAPC is a cell adhesion molecule,
and microinjection of PAPC mRNA constructs into Xenopus
embryos suggests that PAPC acts as a molecule involved in
mesoderm differentiation. A soluble form of the PAPC
extracellular domain is able to block muscle and
mesoderm formation in Xenopus embryos. The nucleotide
sequence encoding Xenopus PAPC is provided in SEQ ID
NO: 6.

Cerberus, frzb-1, or PAPC or fragments thereof
(which also may be synthesized by in vitro methods) may
be fused (by recombinant expression or in vitro covalent
methods) to an immunogenic polypeptide and this, in
turn, may be used to immunize an animal in order to
raise antibodies against the novel proteins. Antibodies
are recoverable from the serum of immunized animals.
Alternatively, monoclonal antibodies may be prepared
from cells from the immunized animal in conventional
fashion. Immobilized antibodies are useful particularly
in the diagnosis (in vitro or in vivo) or purification
of cerberus, frzb-1, or PAPC.

Substitutional, deletional, or insertional
mutants of the novel polypeptides may be prepared by in
vitro or recombinant methods and screened for
immuno-crossreactivity with cerberus, frzb-1, or PAPC and for
cerberus antagonist or agonist activity.

Cerberus or frzb-1 also may be derivatized in
vitro in order to prepare immobilized and labelled
proteins, particularly for purposes of diagnosis of
insufficiencies thereof, or for affinity purification of
antibodies thereto.

Among applications for the novel proteins are
tissue replacement therapy and, because frzb-1 is an
antagonist of Wnt signaling, tumor suppression
therapies. The cerberus receptor may define a novel
signalling pathway. In addition, frzb-1 could permit
the isolation of novel members of the Wnt family of
growth factors.

Brief Description of the Drawings 

Figure 1 illustrates the amino acid sequence
(SEQ ID NO:1) of the Fig. 2 cDNA clone for cerberus;

Figure 2 illustrates a cDNA clone (SEQ ID
NO: 2) for cerberus derived from Xenopus. The sense strand
is on top (5' to 3' direction) and the antisense strand
on the bottom line (in the opposite direction);

Figures 3 and 4 show the amino acid and
nucleotide sequence, respectively, of full-length frzb-1
from Xenopus (SEQ ID NOS:3 and 4);

Figures 5 and 6 show the amino acid and
nucleotide sequence, respectively, of full-length PAPC
from Xenopus (SEQ ID NOS:5 and 6);

Figures 7 and 8 show the amino acid and
nucleotide sequence, respectively, of full-length frzb-1
from mouse (SEQ ID NOS:7 and 8); and

Figures 9 and 10 show the amino acid and
nucleotide sequence, respectively, of full-length frzb-1
from human (SEQ ID NOS:9 and 10).

Detailed Description of the Preferred Embodiments

Among the several novel proteins and their
nucleotide sequences described herein is a novel
endodermal, cardiac, and neural inducing factor in
vertebrates that we have named "cerberus." When
referring to cerberus, the present invention also
contemplates the use of fragments, derivatives,
agonists, or antagonists of cerberus molecules. Because
cerberus has no homology to any reported growth factors,
it is proposed to be the founding member of a novel
family of growth factors with potent biological
activities, which may be isolated using SEQ ID NO: 2.

The amphibian organizer consists of several
cell populations with region-specific inducing
activities. On the basis of morphogenetic movements,
three very different cell populations can be
distinguished in the organizer. First, cells with
crawling migration movements involute, fanning out to
form the prechordal plate. Second, cells involute
through the dorsal lip driven by convergence and
extension movements, giving rise to the notochord of the
trunk. Third, involution ceases and the continuation of
mediolateral intercalation movements leads to posterior
extension movements and to the formation of the tail
notochord and of the chordoneural hinge. The three cell
populations correspond to the head, trunk, and tail
organizers, respectively.

The cerberus gene is expressed at the right
time and place to participate in cell signalling by
Spemann's organizer. Specifically, cerberus is
expressed in the head organizing region that consists of
crawling-migrating cells. The cerberus expressing
region corresponds to the prospective foregut, including
the liver and pancreas anlage, and the heart mesoderm.
Cerberus expression is activated by chordin, noggin, and
organizer-specific homeobox genes.

Our studies were conducted in early embryos of
the frog Xenopus laevis. The frog embryo is well suited
to experiments, particularly experiments pertaining to
generating and maintaining regional differences within
the embryo for determining roles in tissue
differentiation. It is easy to culture embryos with access to the
embryos even at very early stages of development
(preceding and during the formation of body pattern and
differentiation) and the embryos are large. The initial
work with noggin and chordin also had been in Xenopus
embryos, and, as predicted, these factors proved to be
highly conserved among
vertebrates. Predictions based on work with Xenopus as
to corresponding human noggin were proven true and the
cloning of the gene for human noggin was readily
accomplished. (See the description of Xenopus work and
cloning information in PCT application, published March
17, 1994, WO 94/05800, and the subsequent human cloning
based thereon in the PCT application, also published
March 17, 1994, as WO 94/05791.)

CLONING 



The cloning of cerberus, frzb-1, and PAPC
resulted from a comprehensive screen for cDNAs enriched
in Spemann's organizer. Subtractive differential
screening was performed as follows. In brief, poly A+
RNA was isolated from 300 dorsal lip and ventral
marginal zone (VMZ) explants at stage 10½. After first
strand cDNA synthesis approximately 70-80% of common
sequences were removed by subtraction with biotinylated
VMZ poly A+ RNA prepared from 1500 ventral gastrula
halves. For differential screening, duplicate filters
(2000 plaques per 15 cm plate, a total of 80,000 clones
screened) of an unamplified oriented dorsal lip library
were hybridized with radiolabeled dorsal lip or VMZ
cDNA. Putative organizer-specific clones were isolated,
grouped by sequence analysis from the 5' end and
whole-mount in situ hybridization, and subsequently classified
into known and new dorsal-specific genes. Rescreening
of the library (100,000 independent phages) with a
cerberus probe resulted in the isolation of 45
additional clones, 31 of which were similar in size to the
longest one of the 11 original clones, indicating that
they were presumably full-length cDNAs. The longest
cDNAs for cerberus, frzb-1, and PAPC were completely
sequenced.

To explore the molecular complexity of
Spemann's organizer we performed a comprehensive
differential screen for dorsal-specific cDNAs. The
method was designed to identify abundant cDNAs without
bias as to their function. As shown in Table 1, five
previously known cDNAs and five new ones were isolated,
of which three (expressed as cerberus, frzb-1, and PAPC,
respectively) had secretory signal sequences.

TABLE 1

Previously Known Genes    Gene Product                      No. of Isolates

Chordin                   novel secreted protein            70
Goosecoid                 homeobox gene                     3
Pintallavis/XFKH-1        forkhead/transcription factor     2
Xnot-2                    homeobox gene                     1
Xlim-1                    homeobox gene                     1

New Genes

Cerberus                  novel secreted protein            11
PAPC                      cadherin-like/transmembrane       2
Frzb-1                    novel secreted protein            1
Sox-2                     sry/transcription factor          1
Fkh-like                  forkhead/transcription factor     1

The most abundant dorsal-specific cDNA was
chordin (chd), with 70 independent isolates. The second
most abundant cDNA was isolated 11 times and named
cerberus (after a mythological guardian dog with
multiple heads). The cerberus cDNA encodes a putative
secreted polypeptide of 270 amino acids, with an
amino-terminal hydrophobic signal sequence and a
carboxy-terminal cysteine-rich region (Fig. 1). Cerberus is
expressed specifically in the head organizer region of
the Xenopus embryo, including the future foregut.
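
The relative abundance of the isolates listed in Table 1 can be tallied directly. The short Python sketch below is offered only as an illustration of that arithmetic; the counts are transcribed from Table 1, while the dictionary name `ISOLATES` and the helper `rank_by_abundance` are hypothetical names introduced here and are not part of the screening method itself.

```python
# Isolate counts transcribed from Table 1 (known and new dorsal-specific genes).
ISOLATES = {
    "Chordin": 70,
    "Goosecoid": 3,
    "Pintallavis/XFKH-1": 2,
    "Xnot-2": 1,
    "Xlim-1": 1,
    "Cerberus": 11,
    "PAPC": 2,
    "Frzb-1": 1,
    "Sox-2": 1,
    "Fkh-like": 1,
}


def rank_by_abundance(counts):
    """Return (gene, count, fraction of all isolates), most abundant first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(gene, n, n / total) for gene, n in ranked]


if __name__ == "__main__":
    for gene, n, fraction in rank_by_abundance(ISOLATES):
        print(f"{gene:20s} {n:3d} {fraction:6.1%}")
```

Ranked this way, chordin and cerberus head the list, in agreement with the statement that they are the most and second most abundant dorsal-specific cDNAs recovered.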

An abundant mRNA found in the dorsal region of
the Xenopus gastrula encodes the novel putative secreted
protein we have designated as cerberus. Cerberus mRNA
has potent inducing activity in Xenopus embryos, leading
to the formation of ectopic heads. Unlike other
organizer-specific factors, cerberus does not dorsalize
mesoderm and is instead an inhibitor of trunk-tail
mesoderm. Cerberus is expressed in the anterior-most
domain of the gastrula including the leading edge of the
deep layer of the dorsal lip, a region that, as shown
here, gives rise to foregut and midgut endoderm.
Cerberus promotes the formation of cement gland,
olfactory placodes, cyclopic eyes, forebrain, and
duplicated heart and liver (a foregut derivative).
Because the pancreas is also derived from this foregut
region, it is likely that cerberus induces pancreas in
addition to liver. The expression pattern and inducing
activities of cerberus suggest a role for a previously
neglected region of the embryo, the prospective foregut
endoderm, in the induction of the anterior head region
of the embryo.

Turning to Fig. 1, Xenopus cerberus encodes a
putative secreted protein transiently expressed during
embryogenesis, and the deduced amino acid sequence of
Xenopus cerberus is shown. The signal peptide sequence
and the nine cysteine residues in the carboxy-terminus
are indicated in bold. Potential N-linked glycosylation
sites are underlined. In database searches the cerberus
protein showed limited similarity only to the mammalian
Dan protein, a possible tumor suppressor proposed to be
a DNA-binding protein.

Cerberus appears to be a pioneer protein, as
its amino acid sequence and the spacing of its
nine cysteine residues were not significantly similar to
other proteins in the databases (NCBI-GenBank release
93.0). We conclude that the second most abundant
dorsal-specific cDNA encodes a novel putative secreted
factor, which should be the founding member of a novel
family of growth factors active in cell differentiation.

Cerberus Demarcates an Anterior Organizer
Domain. Cerberus mRNA is expressed at low levels in the
unfertilized egg, and zygotic transcripts start
accumulating at early gastrula. Expression continues
during gastrula and early neurula, rapidly declining
during neurulation. Importantly, cerberus expression
starts about one hour after that of chd, suggesting that
cerberus could act downstream of the chd signal.

Whole-mount in situ hybridizations reveal that
expression starts in the yolky endomesodermal cells
located in the deep layer of the organizer. The
cerberus domain includes the leading edge of the most
anterior organizer cells and extends into the lateral
mesoderm. The leading edge gives rise to liver,
pancreas, and foregut in its midline, and the more
lateral region gives rise to heart mesoderm at later
stages of development.

Fig. 2 sets out the sequence of a full length
Xenopus cDNA for cerberus.

This entirely new molecule has demonstrated
physiological properties that should prove useful in
therapeutic, diagnostic, and clinical applications that
require regeneration, differentiation, or repair of
tissues, such as wound repair, neuronal regeneration or
transplantation, supplementation of heart muscle
differentiation, differentiation of pancreas and liver,
and other applications in which cell differentiation
processes are to be induced.

The second, novel, secreted protein we have
discovered is called "frzb-1," which was shown to be a
secreted protein in Xenopus oocyte microinjection
experiments. Thus it provides a natural soluble form of
the related extracellular domains of Drosophila and
vertebrate frizzled proteins. We propose that the
latter proteins could be converted into active soluble
forms by introducing a stop codon before the first
transmembrane domain. We have noted that the
cysteine-rich region of frzb-1 and frizzled contains some overall
structural homology with Wnt proteins using the Profile
Search homology program (Gribskov, Meth. Enzymol., 183,
pp. 146-159, 1990). This had raised the interesting
possibility that frzb-1 could interact directly with Wnt
growth factors in the extracellular space. This was
because we had found that when microinjected into
Xenopus embryos, frzb-1 constructs have moderate
dorsalizing activity, leading to the formation of
embryos with enlarged brain and head, and shortened
trunk. Somatic muscle differentiation, which requires
Xwnt-8, was inhibited. In the case of frzb-1, an
attractive hypothesis, suggested by the structural
homologies, was that it may act as an inhibitor of
Wnt-8, a growth factor that has ventralizing activity in
the Xenopus embryo (Christian and Moon, Genes Dev., 7,
pp. 13-28, 1993). We have shown that frzb-1 can
interact with Xwnt-8 and Wnt-1, and it is expected that
it could also interact with other members of the Wnt
family of growth factors, of which at least 15 members
exist in mammals. In addition, a possible interaction
with Wnts was suggested by the recent discovery that
dishevelled, a gene acting downstream of wingless, has
strong genetic interaction with frizzled mutants in
Drosophila (Krasnow et al., Development, 121, pp.
4095-4102, 1995). This possibility has been explored in
depth (Leyns et al., Cell, 88, pp. 747-756, March 21,
1997), because a soluble antagonist of the Wnt family of
proteins is expected to be of great therapeutic value.
Examples 1 and 2 illustrate tests that show antagonism
of Xwnt-8 by binding to frzb-1.
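
The proposal to convert a membrane-anchored frizzled protein into a soluble form by introducing a stop codon before the first transmembrane domain can be illustrated at the sequence level. The Python sketch below is a minimal illustration under stated assumptions only: the function name `truncate_before_tm`, the toy coding sequence, and the codon index used are hypothetical and are not taken from any sequence of this application or from any real frizzled gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def truncate_before_tm(cds, tm_start_codon, stop="TAA"):
    """Keep the codons preceding the first transmembrane codon and append a stop.

    cds            -- coding DNA sequence, in frame, starting at the ATG
    tm_start_codon -- zero-based index of the first codon of the first
                      transmembrane domain (an assumed, user-supplied value)
    stop           -- stop codon to introduce
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TAG, TGA")
    if not 0 < tm_start_codon <= len(cds) // 3:
        raise ValueError("tm_start_codon is outside the coding sequence")
    # Everything upstream of the transmembrane domain is retained unchanged.
    return cds[: 3 * tm_start_codon] + stop


if __name__ == "__main__":
    # Toy 8-codon sequence (hypothetical), with the "transmembrane" part
    # assumed to begin at codon index 4.
    toy_cds = "ATGGCTTGTGGTCTGCTGCTGTAA"
    print(truncate_before_tm(toy_cds, 4))
```

In practice the position of the first transmembrane domain would have to be determined for the particular frizzled protein, for example by hydropathy analysis; that step is outside the scope of this sketch.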

Vertebrate homologues of Frizzled have been
isolated and they too are anchored to the cell membrane
by seven membrane spanning domains (Wang et al.,
J. Biol. Chem., 271, pp. 4468-4476, 1996). Frzb-1
differs from the frizzled proteins in that it is an
entirely soluble, diffusible secreted protein and
therefore suitable as a therapeutic agent. The
nucleotide sequence that when expressed results in
frzb-1 protein is illustrated by SEQ ID NO: 4.

SEQ ID NO: 4 corresponds to the Xenopus
homolog, but by using it in BLAST searches (and by
cloning mouse frzb-1) we had been able to assemble the
sequence of the entire mature human frzb-1 protein, SEQ
ID NO: 9. Indeed, human frzb-1 is encoded in six
expressed sequence tags (ESTs) available in GenBank.
The human frzb-1 sequence can be assembled by
overlapping in the 5' to 3' direction the ESTs with the
following accession numbers in GenBank: H18848,
R63748, W38677, W44760, H38379, and N71244. No function
had yet been assigned to these EST sequences, but we
believe and thus propose here that human frzb-1 will
have similar functions in cell differentiation to those
described above for Xenopus frzb-1. The nucleotide
sequence of human frzb-1 is shown in SEQ ID NO: 10. The
mouse frzb-1 protein and nucleotide sequences are
provided by SEQ ID NOS:7 and 8, respectively.
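
Assembling a sequence by overlapping ordered fragments in the 5' to 3' direction, as described for the six ESTs, can be sketched in a few lines of Python. This is an illustrative sketch only and not the procedure actually used: the function names `overlap_length` and `assemble_in_order`, the minimum overlap of three bases, and the toy fragments are all assumptions made for the example, and the toy fragments are not the sequences of the listed accessions. Real ESTs also contain sequencing errors, which exact matching of this kind does not tolerate.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no exact overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble_in_order(fragments, min_overlap=3):
    """Merge fragments already ordered 5' to 3' into a single contig."""
    if not fragments:
        raise ValueError("at least one fragment is required")
    contig = fragments[0]
    for fragment in fragments[1:]:
        size = overlap_length(contig, fragment, min_overlap)
        if size == 0:
            raise ValueError("consecutive fragments do not overlap")
        # Append only the part of the fragment that extends the contig.
        contig += fragment[size:]
    return contig


if __name__ == "__main__":
    # Hypothetical toy fragments standing in for ordered ESTs.
    toy_fragments = ["ATGGCTTGTG", "TTGTGGTCTGC", "TCTGCTGCTGTAA"]
    print(assemble_in_order(toy_fragments))
```

The sketch assumes the fragments are supplied in the correct order, as the accession numbers are in the text; determining that order from unordered ESTs is a separate problem.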

In particular, we believe that frzb-1 will
prove useful in gene therapy of human cancer cells. In
this rapidly developing field, one approach is to
introduce vectors expressing anti-sense sequences to
block expression of dominant oncogenes and growth factor
receptors. Another approach is to produce episomal
vectors that will replicate in human cells in a
controlled fashion without transforming the cells. For
an example of the latter (an episomal expression vector
system for human gene therapy), reference is made to
U.S. Patent 5,624,820, issued April 29, 1997, inventor
Cooper.

Gene therapy now includes uses of human tumor
suppression genes. For example, U.S. Patent 5,491,064,
issued February 13, 1996, discloses a tumor suppression
gene localized on chromosome 11 and described as
potentially useful for gene therapy in cancers deleted
or altered in their expression of that gene. Frzb-1
maps to chromosome 2q31-33, and loss of one copy of the
2q arm has been
observed with high incidence in lung carcinomas,
colo-rectal carcinomas, and neuroblastomas, which has
led to the proposal that the 2q arm carries a tumor
suppressor gene. We expect frzb-1 to be a tumor
suppressor gene, and thus to be useful in tumor
suppression applications.

A number of applications for cerberus and 
frzb-1 are suggested from their pharmacological 
(biological activity) properties. 

For example, the cerberus and frzb-1 cDNAs
should be useful as diagnostic tools (such as through
use of antibodies in assays for proteins in cell lines
or use of oligonucleotides as primers in a PCR test to
amplify those with sequence similarities to the
oligonucleotide primer, and to determine how much of the
novel protein is present).

Cerberus, of course, might act upon its target
cells via its own receptor. Cerberus, therefore,
provides the key to isolate this receptor. Since many
receptors mutate to cellular oncogenes, the cerberus
receptor should prove useful as a diagnostic probe for
certain tumor types. Thus, when one views cerberus as a
ligand in complexes, then complexes in accordance with
the invention include antibody bound to cerberus,
antibody bound to peptides derived from cerberus,
cerberus bound to its receptor, or peptides derived from
cerberus bound to its receptor or other factors. Mutant
forms of cerberus, which are either more potent agonists
or antagonists, are believed to be clinically useful.
Such complexes of cerberus and its binding protein
partners will find uses in a number of applications.

Practice of this invention includes use of an
oligonucleotide construct comprising a sequence coding
for cerberus or frzb-1 and for a promoter sequence
operatively linked in a mammalian or a viral expression
vector. Expression and cloning vectors contain a
nucleotide sequence that enables the vector to replicate
in one or more selected host cells. Generally, in
cloning vectors this sequence is one that enables the
vector to replicate independently of the host
chromosomes, and includes origins of replication or
autonomously replicating sequences. The well-known
plasmid pBR322 is suitable for most gram negative
bacteria, the 2µ plasmid origin for yeast, and various
viral origins (SV40, polyoma, adenovirus, VSV or BPV)
are useful for cloning vectors in mammalian cells.

Expression and cloning vectors should contain
a selection gene, also termed a selectable marker.
Typically, this is a gene that encodes a protein
necessary for the survival or growth of a host cell
transformed with the vector. The presence of this gene
ensures that any host cell which deletes the vector will
not obtain an advantage in growth or reproduction over
transformed hosts. Typical selection genes encode
proteins that (a) confer resistance to antibiotics or
other toxins, e.g. ampicillin, neomycin, methotrexate or
tetracycline, or (b) complement auxotrophic deficiencies.

Examples of suitable selectable markers for
mammalian cells are dihydrofolate reductase (DHFR) or
thymidine kinase. Such markers enable the
identification of cells which were competent to take up the
cerberus nucleic acid. The mammalian cell transformants
are placed under selection pressure which only the
transformants are uniquely adapted to survive by virtue
of having taken up the marker. Selection pressure is
imposed by culturing the transformants under conditions
in which the concentration of selection agent in the
medium is successively changed. Amplification is the
process by which genes in greater demand for the
production of a protein critical for growth are
reiterated in tandem within the chromosomes of
successive generations of recombinant cells. Increased
quantities of cerberus or frzb-1 can therefor be
synthesized from the amplified DNA.

For example, cells transformed with the DHFR
selection gene are first identified by culturing all of
the transformants in a culture medium which contains
methotrexate (Mtx), a competitive antagonist of DHFR.
An appropriate host cell in this case is the Chinese
hamster ovary (CHO) cell line deficient in DHFR
activity, prepared and propagated as described by Urlaub
and Chasin, Proc. Nat. Acad. Sci., 77, 4216 (1980). The
transformed cells then are exposed to increased levels
of Mtx. This leads to the synthesis of multiple copies
of the dhfr gene and, concomitantly, multiple copies of
other DNA comprising the expression vectors, such as the
DNA encoding cerberus or frzb-1. Alternatively, host
cells transformed by an expression vector comprising DNA
sequences encoding cerberus or frzb-1 and aminoglycoside
3' phosphotransferase (APH) protein can be selected by
cell growth in medium containing an aminoglycosidic
antibiotic such as kanamycin, neomycin, or G418.
Because eukaryotic cells do not normally express an
endogenous APH activity, genes encoding APH protein,
commonly referred to as neo resistance genes, may be used
as dominant selectable markers in a wide range of
eukaryotic host cells, by which cells transformed by the
vector can readily be identified.

Expression vectors, unlike cloning vectors,
should contain a promoter which is recognized by the
host organism and is operably linked to the cerberus
nucleic acid. Promoters are untranslated sequences
located upstream from the start codon of a structural
gene (generally within about 100 to 1000 bp) that
control the transcription and translation of nucleic
acid under their control. They typically fall into two
classes, inducible and constitutive. Inducible
promoters are promoters that initiate increased levels
of transcription from DNA under their control in
response to some change in culture conditions, e.g. the
presence or absence of a nutrient or a change in
temperature. At this time a large number of promoters
recognized by a variety of potential host cells are well
known. These promoters can be operably linked to
cerberus-encoding DNA by removing them from their gene
of origin by restriction enzyme digestion, followed by
insertion 5' to the start codon for cerberus or frzb-1.
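
The positional statement above (a promoter lying upstream of the start codon, generally within about 100 to 1000 bp) can be illustrated with a minimal Python sketch. The function name `upstream_region`, its parameters, and the idea of passing the start codon position explicitly are assumptions made for illustration; the specification does not describe any software, and locating a true start codon requires more than finding an ATG.

```python
def upstream_region(sequence, start_codon_index, max_bp=1000):
    """Return up to max_bp bases lying 5' of the start codon.

    sequence          -- coding-strand DNA, 5' to 3'
    start_codon_index -- 0-based index of the A in the ATG start codon
                         (supplied by the caller; hypothetical input)
    max_bp            -- how far upstream to look (1000 bp follows the
                         "about 100 to 1000 bp" range in the text)
    """
    dna = "".join(sequence.split()).upper()
    if dna[start_codon_index:start_codon_index + 3] != "ATG":
        raise ValueError("start_codon_index does not point at an ATG")
    # Clamp at the 5' end when fewer than max_bp bases are available.
    first = max(0, start_codon_index - max_bp)
    return dna[first:start_codon_index]


if __name__ == "__main__":
    # Synthetic example: 1200 filler bases, then a start codon.
    example = "C" * 1200 + "ATGGGG"
    print(len(upstream_region(example, 1200)))
```

The returned slice is only the candidate region in which a promoter would be sought; whether a given promoter is functional in a host cell is an experimental question.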

Nucleic acid is operably linked when it is
placed into a functional relationship with another
nucleic acid sequence. For example, DNA for a
presequence or secretory leader is operably linked to
DNA for a polypeptide if it is expressed as a preprotein
which participates in the secretion of the polypeptide;
a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the
sequence; or a ribosome binding site is operably linked
to a coding sequence if it is positioned so as to
facilitate translation. Generally, operably linked
means that the DNA sequences being linked are contiguous
and, in the case of a secretory leader, contiguous and
in reading phase. Linking is accomplished by ligation
at convenient restriction sites. If such sites do not
exist then synthetic oligonucleotide adapters or linkers
are used in accord with conventional practice.

Transcription of the protein-encoding DNA in
mammalian host cells is controlled by promoters obtained
from the genomes of viruses such as polyoma,
cytomegalovirus, adenovirus, retroviruses, hepatitis-B
virus, and most preferably Simian virus 40 (SV40), or
from heterologous mammalian promoters, e.g. the actin
promoter. Of course, promoters from the host cell or
related species also are useful herein.

Cerberus and frzb-1 are clearly useful as a
component of culture media for use in culturing cells,
such as endodermal, cardiac, and nerve cells, in vitro.
We believe cerberus and frzb-1 will find uses as agents
for enhancing the survival or inducing the growth of
liver, pancreas, heart, and nerve cells, such as in
tissue replacement therapy.

The final cDNA isolated containing a signal
sequence results in a peptide designated Paraxial
Protocadherin (PAPC). The cDNA for PAPC is a divergent
member of the cadherin multigene family. PAPC is most
closely related to protocadherin 43 reported by Sano et
al., The EMBO J., 12, pp. 2249-2256, 1993. As shown in
SEQ ID NO: 5, the PAPC gene encodes a transmembrane
protein of 896 amino acids, of which 187 are part of an
intracellular domain. PAPC is a cell adhesion molecule,
and microinjection of PAPC mRNA constructs into Xenopus
embryos suggests that PAPC acts in mesoderm
differentiation. The nucleotide sequence encoding
Xenopus PAPC is provided in SEQ ID NO: 6.

Therapeutic formulations of the novel proteins
may be prepared for storage by mixing the polypeptides
having the desired degree of purity with optional
physiologically acceptable carriers, excipients or
stabilizers, in the form of lyophilized cake or aqueous
solutions. Acceptable carriers, excipients or
stabilizers are nontoxic to recipients at the dosages
and concentrations employed, and include buffers such as
phosphate, citrate, and other organic acids;
antioxidants including ascorbic acid; low molecular weight
(less than about 10 residues) polypeptides; proteins,
such as serum albumin, gelatin or immunoglobulins.
Other components can include glycine, glutamine,
asparagine, arginine, or lysine; monosaccharides,
disaccharides, and other carbohydrates including
glucose, mannose, or dextrins; chelating agents such as
EDTA; sugar alcohols such as mannitol or sorbitol;
salt-forming counterions such as sodium; and/or nonionic
surfactants such as Tween, Pluronics or PEG.

Polyclonal antibodies to the novel proteins
generally are raised in animals by multiple subcutaneous
(sc) or intraperitoneal (ip) injections of cerberus or
frzb-1 and an adjuvant. It may be useful to conjugate
these proteins or a fragment containing the target amino
acid sequence to a protein which is immunogenic in the
species to be immunized, e.g., keyhole limpet
hemocyanin, serum albumin, bovine thyroglobulin, or
soybean trypsin inhibitor using a bifunctional or
derivatizing agent, for example, maleimidobenzoyl
sulfosuccinimide ester (conjugation through cysteine
residues), N-hydroxysuccinimide (through lysine
residues), glutaraldehyde, succinic anhydride, SOCl2, or
R1N=C=NR.

Animals can be immunized against the
immunogenic conjugates or derivatives by combining 1 mg
or 1 µg of conjugate (for rabbits or mice, respectively)
with 3 volumes of Freund's complete adjuvant and
injecting the solution intradermally in multiple sites.
One month later the animals are boosted with 1/5 to 1/10
the original amount of conjugate in Freund's complete
adjuvant by subcutaneous injection at multiple sites.
Seven to 14 days later animals are bled and the serum is
assayed for anti-cerberus titer. Animals are boosted
until the titer plateaus. Preferably, the animal is
boosted with the conjugate of the same cerberus or
frzb-1 polypeptide, but conjugated to a different
protein and/or through a different cross-linking agent.
Conjugates also can be made in recombinant cell culture
as protein fusions. Also, aggregating agents such as
alum are used to enhance the immune response.
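
The dose arithmetic in the immunization schedule above can be written out as a short Python sketch. It only restates the ratios given in the text (a booster of 1/10 to 1/5 of the priming amount, and 3 volumes of adjuvant per volume of conjugate); the function names and the choice of micrograms as the working unit are assumptions for illustration.

```python
from fractions import Fraction


def booster_range(priming_amount):
    """Return (low, high) booster amounts: 1/10 and 1/5 of the priming dose.

    The result is in the same unit as priming_amount.
    """
    amount = Fraction(priming_amount)
    return amount * Fraction(1, 10), amount * Fraction(1, 5)


def adjuvant_volume(conjugate_volume):
    """Volume of Freund's complete adjuvant for a given conjugate volume
    (3 volumes of adjuvant per volume of conjugate, as stated in the text)."""
    return 3 * Fraction(conjugate_volume)


if __name__ == "__main__":
    # Priming doses from the text, expressed in micrograms:
    # 1 mg (1000 micrograms) for rabbits, 1 microgram for mice.
    for species, priming in (("rabbit", 1000), ("mouse", 1)):
        low, high = booster_range(priming)
        print(species, float(low), float(high))
```

Exact fractions are used so that the mouse booster (a tenth to a fifth of a microgram) is not distorted by floating-point rounding.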

Monoclonal antibodies are prepared by
recovering spleen cells from immunized animals and
immortalizing the cells in conventional fashion, e.g. by
fusion with myeloma cells or by EB virus transformation
and screening for clones expressing the desired
antibody.

Antibodies are useful in diagnostic assays for
cerberus, frzb-1, or PAPC or their antibodies and to
identify family members. In one embodiment of a
receptor binding assay, an antibody composition which
binds to all of a selected plurality of members of the
cerberus family is immobilized on an insoluble matrix,
the test sample is contacted with the immobilized
antibody composition in order to adsorb all cerberus
family members, and then the immobilized family members
are contacted with a plurality of antibodies specific
for each member, each of the antibodies being
individually identifiable as specific for a
predetermined family member, as by unique labels such as
discrete fluorophores or the like. By determining the
presence and/or amount of each unique label, the
relative proportion and amount of each family member can
be determined.
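
The last step of the assay described above, turning per-label readings into relative proportions, might be sketched as follows. This is an illustrative Python sketch only: it assumes each label's signal has already been calibrated so that it is proportional to the amount of the corresponding family member, which the specification does not detail, and the name `relative_proportions` and the example label names are hypothetical.

```python
def relative_proportions(label_signals):
    """Normalize calibrated per-label signals to fractions that sum to 1.

    label_signals -- mapping of family-member name to a non-negative,
                     calibrated signal (hypothetical input format)
    """
    if any(value < 0 for value in label_signals.values()):
        raise ValueError("signals must be non-negative")
    total = sum(label_signals.values())
    if total == 0:
        raise ValueError("no signal detected for any family member")
    return {name: value / total for name, value in label_signals.items()}


if __name__ == "__main__":
    # Made-up readings for two hypothetical family members.
    print(relative_proportions({"member_A": 30.0, "member_B": 10.0}))
```

Absolute amounts would additionally require a standard curve for each label; the sketch covers only the relative proportion.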

The antibodies also are useful for the
affinity purification of the novel proteins from
recombinant cell culture or natural sources. Antibodies
that do not detectably cross-react with other growth
factors can be used to purify the proteins free from
these other family members.

EXAMPLE 1

Frzb-1 Antagonizes Xwnt-8 Non-Cell Autonomously

To test whether frzb-1 can antagonize
secondary axes caused by Xwnt-8 after secretion by
injected cells, the following experimental design was
used. Thus, frzb-1 mRNA was injected into each of the
four animal blastomeres of eight-cell embryos, and
subsequently a single injection of Xwnt-8 mRNA was given
to a vegetal-ventral blastomere at the 16-32 cell stage.
In two independent experiments, we found that injection
of frzb-1 alone (n=13) caused mild dorsalization with
enlargement of the cement gland in all embryos and that
injection of Xwnt-8 alone (n=53) led to induction of
complete secondary axes in 67% of the embryos. However,
injection of frzb-1 into animal caps abolished the
formation of complete axes induced by Xwnt-8 (n=27),
leaving only a residual 14% of embryos with very weak
secondary axes. The double-injected embryos retained
the enlarged cement gland phenotype caused by injection
of frzb-1 mRNA alone. Because both mRNAs encode
secreted proteins and were microinjected into different
cells, we conclude that the antagonistic effects of
frzb-1 and Xwnt-8 took place in the extracellular space
after these proteins were secreted.

Membrane-Anchored Wnt-1 Confers Frzb-1 Binding

To investigate a possible interaction between
frzb-1 and Wnts, the first step was to insert an HA
epitope tag into a Xenopus frzb-1 construct driven by
the CMV (cytomegalovirus) promoter. Frzb1-HA was tested
in mRNA microinjection assays in Xenopus embryos and
found to be biologically active. Conditioned medium
from transiently transfected cells contained up to 10
µg/ml of Frzb1-HA (quantitated on Western blots using an
HA-tagged protein standard).

Transient transfection of 293 cells has been
instrumental in demonstrating interactions between
wingless and frizzled proteins. We therefore took
advantage of constructs in which Wnt-1 was fused at the
amino terminus of CD8, generating a transmembrane
protein containing biologically active Wnt-1 exposed to
the extracellular compartment. A Wnt1CD8 cDNA construct
(a generous gift of Dr. H. Varmus, NIH) was subcloned
into the pcDNA (Invitrogen) vector and transfected into
293 cells. After incubation with Frzb1-HA-conditioned
medium (overnight at 37°C), intensely labeled cells were
observed by immunofluorescence. As a negative control,
a construct containing 120 amino acids of Xenopus
chordin, an unrelated secreted protein, was used.
Transfection of this construct produced background
binding of Frzb1-HA to the extracellular matrix, both
uniform and punctate. Cotransfection of Wnt1CD8 with
pcDNA-LacZ showed that transfected cells stained
positively for Frzb1-HA and LacZ. Since Wnt1CD8
contains the entire CD8 molecule, a CD8 cDNA was used as
an additional negative control. After transfection with
LacZ and full-length CD8, Frzb1-HA failed to bind to the
transfected cells. Although most of our experiments
were carried out at 37°C, Frzb1-HA-conditioned medium
also stained Wnt1CD8-transfected cells after incubation
at 4°C for 2 hours.

Attempts to biochemically quantitate the
binding of Frzb-1 to Wnt1CD8-transfected cells were
unsuccessful due to high background binding to control
cultures, presumably due to binding to the extracellular
matrix. Thus, we were unable to estimate a KD for the
affinity of the Frzb-1/Wnt-1 interaction. However, when
serial dilutions of conditioned medium containing
Frzb1-HA were performed (ranging from 2.5 x 10^-7 to
1.25 x 10^-10 M), staining of Wnt1CD8-transfected cells
was found at all concentrations.
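
The span of the dilution series quoted above can be made concrete with a small Python sketch. The two end-point concentrations come from the text; the dilution factor per step is not stated there, so the twofold and tenfold factors used below are assumptions for illustration, as are the function names.

```python
import math


def fold_range(highest_molar, lowest_molar):
    """Overall fold dilution between the two ends of the series."""
    return highest_molar / lowest_molar


def steps_needed(highest_molar, lowest_molar, dilution_factor):
    """Smallest number of equal dilution steps that reaches lowest_molar."""
    exact = math.log(fold_range(highest_molar, lowest_molar)) / math.log(dilution_factor)
    # Small tolerance so exact powers are not rounded up by float error.
    return math.ceil(exact - 1e-9)


def dilution_series(start_molar, dilution_factor, steps):
    """Concentrations after 0, 1, ..., steps successive dilutions."""
    return [start_molar / dilution_factor ** i for i in range(steps + 1)]


if __name__ == "__main__":
    high, low = 2.5e-7, 1.25e-10  # end points quoted in the text (M)
    print(fold_range(high, low))
    for factor in (2.0, 10.0):  # assumed step factors, not from the text
        print(factor, steps_needed(high, low, factor))
```

The two quoted end points differ by roughly 2000-fold, which is the range over which staining was still observed.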

Although we have been unable to provide
biochemical evidence for direct binding between Wnts and
frzb-1, this cell biological assay indicates that
Frzb1-HA can bind, directly or indirectly, to Wnt-1 on
the cell membrane in the 10^-10 M range.



It is to be understood that while the
invention has been described above in conjunction with
preferred specific embodiments, the description and
examples are intended to illustrate and not limit the
scope of the invention, which is defined by the scope of
the appended claims.

It is Claimed:

1. A substantially pure protein
characterized by a physiologically active form and
comprising an amino acid sequence encoded by the DNA of
SEQ ID NO:2.

2. The protein as in claim 1 having
neurotrophic, growth or differentiation factor activity.

3. A composition comprising the protein of 
claim 1 and a physiologically acceptable carrier with 
which the peptide is admixed. 

4. An oligonucleotide construct comprising
a sequence coding for a protein and an expression vector
operatively linked therewith, the protein having
neurotrophic, growth or differentiation factor activity
and being expressible from SEQ ID NO:2.

5. The construct as in claim 4 wherein the
expression vector is a mammalian or viral expression
vector.

6. A substantially pure protein
characterized by a physiologically active form and
comprising an amino acid sequence encoded by the DNA of
SEQ ID NO:4, SEQ ID NO:8, or SEQ ID NO:10.

7. The protein as in claim 6 having 
neurotrophic, growth or differentiation factor activity. 

8. A composition comprising the protein of
claim 6 and a physiologically acceptable carrier with
which the protein is admixed.

9. An oligonucleotide construct comprising
a sequence coding for a protein and an expression vector
operatively linked therewith, the protein being
expressible from SEQ ID NO:4, SEQ ID NO:8 or SEQ ID
NO:10.

10. The construct as in claim 9 wherein the
protein is expressible in soluble form.

11. The construct as in claim 9 wherein the
expression vector is a mammalian or viral expression
vector.

12. A complex comprising a substantially pure 
frzb-1 protein complexed with at least one Wnt protein. 

13. A substantially pure protein 
characterized by a physiologically active form and 
comprising an amino acid sequence encoded by the DNA of 
SEQ ID NO: 6. 

14. The protein as in claim 13 having 
mesoderm differentiation activity. 

15. A composition comprising the protein of 
claim 13 and a physiologically acceptable carrier with 
which the protein is admixed. 

MLLNVLRICI IVCLVNDGAG KHSEGRERTK TYSLNSRGYF 40

RKERGARRSK ILLVNTKGLD EPHIGHGDFG LVAELFDSTR 80 

THTNRKEPDM NKVKLFSTVA HGNKSARRKA YNGSRRNIFS 120 

RRSFDKRNTE VTEKPGAKMF WNNFLVKMNG APQNTSHGSK 160 

AQEIMKEACK TLPFTQNIVH ENCDRMVTQN NLCFGKCISL 200 

HVPNQQDRRN TCSHCLPSKF TLNHLTLNCT GSKNVVKVVM 240

MVEECTCEAH KSNFHQTAQF NMDTSTTLHH 270 
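
The amino acid sequence above can be checked against the two-strand nucleotide listing that follows it, which is also a practical way to spot scanning errors in either listing. The Python sketch below is illustrative only: it assumes the standard genetic code (the document does not name a codon table), assumes the lower line of each pair is the base-by-base complement of the upper line, and uses hypothetical helper names (`translate`, `complement`, `strand_mismatches`).

```python
# Standard genetic code with codons enumerated in T, C, A, G order.
# (Assumption: the source does not state which codon table applies.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def clean(sequence):
    """Drop the spaces between ten-base blocks and upper-case the rest."""
    return "".join(sequence.split()).upper()


def translate(coding_strand):
    """Translate from the first base until a stop codon or the end."""
    dna = clean(coding_strand)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unreadable codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def complement(strand):
    """Base-by-base complement (not reversed); unreadable bases become '?'."""
    return "".join(PAIRS.get(base, "?") for base in clean(strand))


def strand_mismatches(top, bottom):
    """0-based positions where the lower strand is not the complement of the upper."""
    expected = complement(top)
    observed = clean(bottom)
    return [i for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


if __name__ == "__main__":
    # First 60 bases of the open reading frame, copied from the listing.
    orf_start = ("ATGTTACTAA ATGTACTCAG GATCTGTATT "
                 "ATCGTCTGCC TTGTGAATGA TGGAGCAGGA")
    print(translate(orf_start))
    print(strand_mismatches("GAATTCCCAG", "CTTAAGGGTC"))
```

Translating the 60 bases shown reproduces the first twenty residues of the protein listing, and any position reported by `strand_mismatches` marks a base that should be re-read against the original figure rather than corrected automatically.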

GAATTCCCAG CAAGTCGCTC AGAAACACTG CAGGGTCTAG ATATCATACA ATGTTACTAA 60
CTTAAGGGTC GTTCAGCGAG TCTTTGTGAC GTCCCAGATC TATAGTATGT TACAATGATT 

ATGTACTCAG GATCTGTATT ATCGTCTGCC TTGTGAATGA TGGAGCAGGA AAACACTCAG 120 
TACATGAGTC CTAGACATAA TAGCAGACGG AACACTTACT ACCTCGTCCT TTTGTGAGTC 

AAGGACGAGA AAGGACAAAA ACATATTCAC TTAACAGCAG AGGTTACTTC AGAAAAGAAA 180 
TTCCTGCTCT TTCCTGTTTT TGTATAAGTG AATTGTOGTC TCCAATGAAG TCTTTTCTTT 

GAGGAGCACG TAGGAGCAAG ATTCTGCTGG TGAATACTAA AGGTCTTGAT GAACCCCACA 240 
CTCCTCGTGC ATCCTCGTTC TAAGACGACC ACTTATGATT TCCAGAACTA CTTGGGGTGT 

TTGGGCATGG TGATTTTCGC TTAGTAGCTG AACTATTTGA TTCCACCAGA ACACATACAA 300 
AACCCGTACC ACTAAAAGCG AATCATCGAC TTGATAAACT AAGGTGGTCT TGTGTATGTT 

ACAGAAAAGA GCCAGACATG AACAAAGTCA AG C TTTTCTC AACAGTTGCC CATGGAAACA 360 
TGTCTTTTCT CGGTCTGTAC TTGTTTCAGT TOGAAAAGAG TTGTCAACGG GTACCTTTGT 

AAAGTGCAAG AAGAAAAGCT TACAATGGTT CTAGAAGGAA TATTTTTCCT CGCCGTTCTT 420 
TTTCACGTTC TTCTTTTCGA ATGTTACCAA GATCTTCCTT ATAAAAAGGA GCGGCAAGAA 

TTGATAAAAG AAATACAGAG GTTACTGAAA AGCCTGGTGC CAAGATGTTC TGGAACAATT 480 
AACTATTTTC TTTATGTCTC CAATGACTTT TCGGAOCAOG GTTCTACAAG AOCTTGTTAA 

TTTTGGTTAA AATGAATGGA GCCCCACAGA ATACAAGCCA TGGCAGTAAA GCACAGGAAA 540 
AAAACCAATT TTACTTACCT CGGGGTGTCT TATGTTOGGT ACCGTCATTT CGTGTCCTTT 

TAATGAAAGA AGCTTGCAAA ACCTTGTTTT TCACTCAGAA TATTGTACAT GAAAACTGTG 600 
ATTACTTTCT TCGAACGTTT TGGAACAAAA AGTGAGTCTT ATAACATGTA CTTTTGACAC 

ACAGGATGGT GATACAGAAC AATCTGTGCT TTGGTAAATG CATCTCTCTC CATGTTCCAA 660 
TGTCCTAOCA CTATGTCTTG TTAGACAOGA AACCATTTAC GTAGAGAGAG GTACAAGGTT 

ATCAGCAAGA TCGACGAAAT ACTTGTTCCC ATTGCTTGOC GTCCAAATTT ACCCTGAACC 720 
TAGTOGTTCT AGCTGCTTTA TGAACAAGGG TAACGAAOGG CAGGTTTAAA TGGGACTTGG 

AOCTGACGCT GAATTGTACT GGATCTAAGA ATGTAGTAAA GGTTGTCATG ATGGTAGAGG 780 
TGGACTGCGA CTTAACATGA CCTAGATTCT TACATCATTT CCAACAGTAC TACCATCTCC 

AATGCACGTG TGAAGCTCAT AAGAGCAACT TCCACCAAAC TGCACAGTTT AACATGGATA 840
TTACGTGCAC ACTTCGAGTA TTCTCGTTGA AGGTGGTTTG ACGTGTCAAA TTGTACCTAT

CATCTACTAC CCTGCACCAT TAAAGGACTG CCATACAGTA TGGAAATGCC CTTTTGTTGG 900
GTAGATGATG GGACGTGGTA ATTTCCTGAC GGTATGTCAT ACCTTTACGG GAAAACAACC

AATATTTGTT ACATACTATG CATCTAAAGC ATTATGTTGC CTTCTATTTC ATATAACCAC 960
TTATAAACAA TGTATGATAC GTAGATTTCG TAATACAACG GAAGATAAAG TATATTGGTG

ATGGAATAAG GATTGTATGA ATTATAATTA ACAAATGGCA TTTTGTGTAA CATGCAAGAT 1020
TACCTTATTC CTAACATACT TAATATTAAT TGTTTACCGT AAAACACATT GTACGTTCTA

Figure 2A 

SUBSTITUTE SHEET (RULE 26)

WO 97/48275 PCT/US97/10942

3/18



CTCTGTTCCA TCAGTTGCAA GATAAAAGGC AATATTTGTT TGACTTTTTT TCTACAAAAT 1080 
GAGACAAGGT AGTCAAOGTT CTATTTTCCG TTATAAACAA ACTGAAAAAA AGATGTTTTA 

GAATACCCAA ATATATGATA AGATAATGGG GTCAAAACTG TTAAGGG6TA AtGTAATAAT 1140 
CTTATGGGTT TATATACTAT TCTATTAOOC CAG TTT TGAC AATTCCCCAT TACATTATTA 

AGGGACTAAG TTTGCCCAGG AGCAGTGACC CATAACAACC AATCA6CAGG TATGATTTAC 1200 
TCCCTGATTC AAACGGGTCC TCGTCACTGG GTATTGTTGG TTAGTCGTCC ATACTAAATG 

TGGTCACCTG TTTAAAAGCA AACATCTTAT TGGTTGCTAT 6GGTTACTGC TTCTGGGCAA 1260 
ACCAGTGGAC AAATTTTCGT TTGTAGAATA ACCAACGATA CCCAATGACG AAGACCCGTT 

AATGTGTGCC TCATAGGGGG GTTAGTGTGT TGT6TACTGA ATAAATTGTA TTTATTTCAT 1320 
TTACACACGG AGTATCCCCC CAATCACACA ACACATGACT TATTTAACAT AAATAAAGTA 

TGTTACAAAA AAAAAAAA 
ACAATGTTTT TTTTTTTT 

GAATTCCCTT TCACACAGGA CTCCTGGCAG AGGTGAATGG TTAGCCCTAT GGATTTGGTT 60
CTTAAGGGAA AGTGT6TCCT GAGGAOOGTC TCCACTTACC AATCGGGATA CCTAAACCAA 

TGTTGATTTT GACACATGAT TGATTGCTTT CACAIAGGAT TGAAGGACTT GGATTTTTAT 120 
ACAACTAAAA CTGTGTACTA ACTAACGAAA GTCTRTOCTA ACTTCCTGAA OCTAAAAATA 

CTAATTCTGC ACTTTTAAAT TATCTGAGTA ATTGTTCATT TTGTATTGGA TGGGACTAAA 180 
GATTAAGACG TGAAAATTTA ATAGACTCAT TAACAAGTAA AACATAACCT ACCCTGATTT 

GATAAACTTA ACTCCTTGCT TTTGACTTGC OCATAAACTA TAAGGTGGGG TGAGTTGTAG 240 
CTATTTGAAT TGAGGAAOGA AAACTGAACG GGTATTTGAT ATTCCACCCC ACTCAACATC 

TTGCTTTTAC ATGTGC0CA6 ATTTTCCCTG TATTCCCTGT ATTCCCTCTA AAGTAAGCCT 300 
AACGAAAATG TACACGGGTC TAAAAGGGAC ATAAGGGACA TAAGGGAGAT TTCATTCGGA 

ACACATACAG GTTGGGCAGA ATAACAATGT CTCGAACAAG GAAAGTGGAC TCATTACTGC 360 
TGTGTATGTC CAACCCGTCT TATTGTTACA GAGCTTGTTC CTTTCACCTG AGTAATGACG 

TACTGGCCAT ACCTGGACTG GCGCTTCTCT TATTACCCAA TGCTTACTGT GCTTCGTGTG 420 
ATGACCGGTA TGGACCTGAC CGCGAAGAGA ATAATGGGTT ACGAATGACA CGAAGCACAC 

AGCCTGTGCG GATCCCCATG TGCAAATCTA TGCCATGGAA CATGACCAAG ATGCCCAACC 480 
TOGGACACGC CTAGGGGTAC ACGTTTAGAT ACGGTACCTT GTACTGGTTC TACGGGTTGG 

ATCTCCACCA CAGCACTCAA GCCAATGOCA TCCTGGCAAT TGAACAGTTT GAAGGTTTGC 540 
TAGAGGTGGT GTCGTGAGTT OGGTTAOGGT AGGAOCGTTA ACTTGTCAAA CTTCCAAACG 

TGACGACTGA ATGTAGCCAG GACCTTTTGT TCTTTCTGTG TGCCATGIAT GCCCCCATTT 600 
ACTGGTGACT TACATCGGTC CTGGAAAACA AGAAAGACAC ACGGTACATA CGGGGGTAAA 

GTACCATCGA TTTCCAGCAT GAACCAATTA AGCCTTGCAA GTC0GT6TGC GAAAGGGOCA 660 
CATGGTAGCT AAAGGTOGTA CTTGGTTAAT TOGGAAOGTT CAGGCACACG CTTTOCOGGT 

GGGCCGGCTG TGAGCCCATT CTCATAAAGT ACCGGCACAC TTGGCCAGAG AGCCTGGCAT 720 
COCGGCOGAC ACTOGGGTAA GAGTATTTCA TGGCCGTGTG AA00G6TCTC TCGGAGOGTA 

GTGAAGAGCT GCCCGTATAT GACAGAGGAG TCTGCATCTC CCCAGAGGCT ATCGTCACAG 780
CACTTCTCGA CGGGCATATA CTGTCTCCTC AGACGTAGAG GGGTCTCCGA TAGCAGTGTC

TGGAACAAGG AACAGATTCA ATGOCAGACT TCTCCATGGA TTCAAACAAT GGAAATTGCG 840 
AOCTTGTTCC TTGTCTAAGT TAOGGTCTGA AGAGGTAOCT AAGTTTGTTA CCTTTAAOGC 

GAAGOGGCAG GGAGCACTGT AAATGGAAGC CCATGAAGGC AACOCAAAAG ACGTATCTCA 900 
CTTCGCOGTC CCTCGTGACA TTTACGTTCG GGTACTTOCG TTGGGTTTTC TGCATAGAGT 

AGAATAATTA CAATTATGTA ATCAGAGCAA AAGTGAAAGA GGTGAAAGTG AAATGOCACG 960 
TCTTATTAAT GTTAATACAT TAGTCTOGTT TTCACTTTCT CCACTTTCAC TTTACGGTGC 

ACGCAACAGC AATTGTGGAA GTAAAGGAGA TTCTCAAGTC TTCOCTAGTG AACATTOCTA 1020 
TGOGTTGTCG TTAACACCTT CATTTCCTCT AAGAGTTCAG AAGGGATCAC TTGTAAGGAT 

AAGACACAGT GACACTGTAC ACCAACTCAG GCTGCTTGTG CCCCCAGCTT GTTGCCAATG 1080
TTCTGTGTCA CTGTGACATG TGGTTGAGTC CGACGAACAC GGGGGTCGAA CAACGGTTAC 

AGGAATACAT AATTATGGGC TATGAAGACA AAGAGCGTAC CAGGCTTCTA CTAGTGGAAG 1140 
TCCTTATGTA TTAATACCCG ATACTTCTGT TTCTCGCATG GTOOGAAGAT GATCAOCTTC 

GATCCTTGGC CGAAAAATGG AGAGATCGTC TTGCTAAGAA AGTCAAGCGC TGGGATCAAA 1200 
CTAGGAACCG GCTTTTTACC TCTCTAGCAG AACGATTCTT TCAGTTCGCG ACCCTAGTTT 

AGCTTOGACG TCCCAGGAAA AGCAAAGACC CCGTGGCTCC AATTCCCAAC AAAAACAGCA 1260 
TCGAAGCTGC AGGGTCCTTT TCGTTTCTGG GGCACCGAGG TTAAGGGTTG TTTTTGTCGT 

ATTCCAGACA AGCGCGTAGT TAGACTAACG GAAAGGTGTA TGGAAACTCT ATGGACTTTG 1320 
TAAGGTCTGT TCGCGCATCA ATCTGATTGC CTTTCCACAT ACCTTTGAGA TACCTGAAAC 

AAACTAAGAT TTGCATTGTT GGAAGAGCAA AAAAGAAATT GCACTACAGC AOGTTATATT 1380 
TTTGATTCTA AACGTAACAA CCTTCTCGTT TTTTCTTTAA CGTGATGTCG TGCAATATAA 

CTATTGTTTA CTACAAGAAG CTGGTTTAGT TGATTGTAGT TCTCCTTTCC TTCTTT TTT T 1440 
GATAACAAAT GATGTTCTTC GACCAAATCA ACTAACATCA AGAGGAAAGG AAGAAAAAAA 

TTATAACTAT ATTTGCACGT GTTCCCAGGC AATTGTTTTA TTCAACTTCC AGTGACAGAG 1500 
AATATTGATA TAAAOGTGCA CAAGGGTCCG TTAACAAAAT AAGTTGAAGG TCACTGTCTC 

CAGTGACTGA ATGTCTCAGC CTAAAGAAGC TCAATTCATT TCTGATCAAC TAATGGTGAC 1560 
GTCACTGACT TACAGAGTCG GATTTCTTCG AGTTAAGTAA AGACTAGTTG ATTACCACTG 

AAGTGTTTGA TACTTGGGGA AAGTGAACTA ATTGCAATGG TAAATCAGAG AAAAGTTGAC 1620 
TTCACAAACT ATGAACCCCT TTCACTTGAT TAACGTTAOC ATTTAGTCTC TTTTCAACTG 

CAATGTTGCT TTTCCTGTAG ATGAACAAGT GAGAGATCAC ATTTAAATGA TGATCACTTT 1680 
GTTACAAOGA AAAGGACATC TACTTGTTCA CTCTCTAGTG TAAATTTACT ACTAGTGAAA 

CCATTTAATA CTTTCAGCAG TTTTAGTTAG ATGACATGTA GGATGCACCT AAATCTAAAT 1740 
GGTAAATTAT GAAAGTCGTC AAAATCAATC TACTGTACAT CCTAOGTGGA TTTAGATTTA 

ATTTTATCAT AAATGAAGAG CTGGTTTAGA CTGTATGGTC ACTGTTGGGA AGGTAAATGC 1800 
TAAAATAGTA TTTACTTCTC GAOCAAATCT GACAIACCAG TGACAACOCT TCCATTTACG 

CTACTTTGTC AATTCTGTTT TAAAAATTGC CTAAATAAAT ATTAAGTCCT AAATAAAAAA 1860 
GATGAAACAG TTAAGACAAA ATTTTTAACG GATTTATTTA TAATTCAGGA TTTATTTTTT

MLLLFRAIPM LLLGLMVLQT DCEIAQYYID EEEPPGTVIA VLSQHSIFNT TDIPATNFRL 60

MKQFNNSLIG VRESDGQLSX KERIDREQIC RQSLHCNLAL DVVSFSKGHF KLLNVKVEVR 120

DINDHSPHFP SEIMHVEVSE SSSVGTRIPL EIAIDEDVGS NSIQNFQISN NSHFSIDVLT 180

RADGVKYADL VLKRELDREI QPTYIMELLA MDGGVPSLSG TAVVNIKVLD FNDNSPVFER 240

STIAVDLVED APLGYLLLEL HATDDDEGVN GEIVYGFSTL ASQEVRQLFK INSRTGSVTL 300

EGQVDFETKQ TYEFEVQAQD LGPHPLTATC KVTVHILDVN DNTPAITITP LTTVNAGVAY 360

IPETATKENF IALISTTDRA SGSNGQVRCT LYGHEHFKLQ QAYEDSYMTV TTSTLDRENI 420

AAYSLTVVAE DLGFPSLKTK KYYTVKVSDE NDNAPVFSKP QYEASILENN APGSYITTVI 480

ARDSDSDQKG KVNYRLVDAK VMGQSLTTFV SLDADSGVLR AVRSLDYEKL KQLDFEIEAA 540

DHGIPQLSTR VQLNLRIVDQ NDNCPVITNP LLNNGSGEVL LPISAPQNYL VFQLKAEDSD 600

EGHNSQLFYT ILRDPSRLFA INKESGEVFL KKQLHSDHSE DLSIVVAVYD LGRPSLSTHA 660

TVKFILTDSF PSNVEVVILQ PSAEEQHQID MSIIFIAVLA GGCALLLLAI FFVACTCKKK 720

AGEFKQVPEQ HGTCNEERLL STPSPQSVSS SLSQSESCQL SIHTESENCS VSSHQEQHQQ 780

TGIKHSISVP SYHTSGWHLD NCAMSISGHS HMGHISTKVQ WAKEIVTSMT VTLILVENQK 840
RRALSSQCRH KPVLNTQMNQ QGSDMPITIS ATESTRVQKM GTAHCNMKRA IDCLTL 

GAATTCCGAG AGATGAACTC CTTGAGATTG TTTTAAATGA CTGCAGGTCT GGAAGGATTC 60
CTTAAGGGTC TCTACTTGAG GAACTCTAAC AAAATTTACT GACGTCCAGA CCTTCCTAAG 

ACATTGCCAC ACTGTTTCTA GGCATGAAAA AACTGCAAGT TTCAACTTTG TTTTTGGTGC 120 
TGTAAOGGTG TGACAAAGAT CCGTACTTTT TTGACGTTCA AAGTTGAAAC AAAAACCACG 

AACTTTGATT CTTCAAGATG CTGCTTCTCT TCAGAGCCAT TCCAATGCTG CTGTTGGGAC 180 
TTGAAACTAA GAAGTTCTAC GACGAAGAGA AGTCTCGGTA AGGTTACGAC GACAACCCTG 

TGATGGTTTT ACAAACAGAC TGTGAAATTG CCCAGTACTA CATAGATGAA GAAGAACCCC 240 
ACTACCAAAA TGTTTGTCTG ACACTTTAAC GGGTCATGAT GTATCTACTT CTTCTTGQGG 

CTGGCACTGT AATTGCAGTG TTGTCACAAC ACTCCATATT TAACACTACA GATATACCTG 300 
GACCGTGACA TTAACGTCAC AACAGTGTTG TGAGGTATAA ATTGTGATGT CTATATGGAC 

CAACCAATTT CCGTCTAATG AAGCAATTTA ATAATTCCCT TATCGGAGTC CGTGAGAGTG 360 
GTTGGTTAAA GGCAGATTAC TTCGTTAAAT TATTAAGGGA ATAGCCTCAG GCACTCTCAC 

ATGGGCAGCT GAGCATCATG GAGAGGATTG ACCGGGAGCA AATCTGCAGG CAGTCCCTTC 420 
TACCCGTOGA CTCGTAGTAC CTCTCCTAAC TGGCCCTOGT TTAGACGTCC GTCAGGGAAG 

ACTGCAACCT GGCTTTGGAT GTGGTCAGCT TTTCCAAAGG ACACTTCAAG CTTCTGAAOG 480 
TGACGTTGGA CCGAAACCTA CACCAGTCGA AAAGGTTTCC TGTGAAGTTC GAAGACTTGC 

TGAAAGTGGA GGTGAGAGAC ATTAATGACC ATAGCCCTCA CTTTCCCAGT GAAATAATGC 540 
ACTTTCAOCT CCACTCTCTG TAATTACTGG TATCGGGAGT GAAAGGGTCA CTTTATTACG 

ATGTGGAGGT GTCTGAAAGT TCCTCTGTGG GCACCAGGAT TCCTTTAGAA ATTGCAATAG 600 
TACACCTCCA CAGACTTTCA AGGAGACACC CGTGGTCCTA AGGAAATCTT TAACGTTATC 

ATGAAGATGT TGGGTCCAAC TCCATCCAGA ACTTTCAGAT CTCAAATAAT AGCCACTTCA 660 
TACTTCTACA ACCCAGGTTG AGGTAGGTCT TGAAAGTCTA GAGTTTATTA TCGGTGAAGT 

GCATTGATGT GCTAACCAGA GCAGATGGGG TGAAATATGC AGATTTAGTC TTAATGAGAG 720 
CGTAACTACA OGATTGGTCT CGTCTACCCC ACTTTAIACG TCTAAATCAG AATTACTCTC 

AACTGGACAG GGAAAXCCAG CCAACATACA TAATGGAGCT ACTAGCAATG GATGGGGGTG 780 
TTGAOCTGTC CCTTTAGGTC GGTTGTATGT ATTACCTCGA TGATCGTTAC CTACCCOCAC 



TACCATCACT ATCTGGTACT GCAGTGGTTA ACATCCGAGT CCTGGACTTT AATGATAACA 840 
ATGGTAGTGA TAGAOCATGA OGTCACCAAT TGTAGGCTCA GGACCTGAAA TTACTATTGT 

GCOCAGTGTT TGAGAGAAGC ACCATTGCTG TGGAOCTAGT AGAGGATGCT CCTCTGGGAT 900 
CGQGTCACAA ACTCTCTTCG TGGTAACGAC ACCTGGATCA TCTOCTACGA GGAGACCCTA 

ACCTTTTGTT GGAGTTACAT GCTACTGAOG ATGATGAAGG AGTGAATGGA GAAATTGTTT 960 
TGGAAAACAA CCTCAATGTA OGATGACTGC TACTACTTOC TCACTTACCT CTTTAACAAA 

ATGGATTCAG CACTTTGGCA TCTCAAGAGG TACGTCAGCT ATTTAAAATT AACTCCAGAA 1020 
TACCTAAGTC GTGAAAGCGT AGAGTTCTCC ATGCAGTCGA TAAATTTTAA TTGAGGTCTT 

CTGGCAGTGT TACTCTTGAA GGCCAAGTTG ATTTTGAGAC CAAGCAGACT TACGAATTTG 1080
GACCGTCACA ATGAGAACTT CCGGTTCAAC TAAAACTCTG GTTOGTCTGA ATGCTTAAAC 

AGGTACAAGC CCAAGATTTG GGOCCCAAOC CACTGACTGC TACTTGTAAA GTAACTGTTC 1140 
T0CATGTTC6 GGTTCTAAAC OCGGGGTTGG GTGACTGACG ATGAACATTT CATTGACAAG 

ATATACTTGA TGTAAATGAT AATACCCCAG CCATCACTAT TACCCCTCTG ACTACTGTAA 1200 
TATATGAACT ACATTTACTA TTATGGGGTC GGTAGTGATA ATGGGGAGAC TGATGACATT 

ATGCAGGAGT TGCCTATATT CCAGAAACAG CCACAAAGGA GAACTTTATA GCTCTGATCA 1260 
TACGTCCTCA ACGGATATAA GGTCTTTGTC GGTGTTTCCT CTTGAAATAT CGAGACTAGT 

GCACTACTGA CAGAGCCTCT GGATCTAATG GACAAGTTCG CTGTACTCTT TATGGACATG 1320 
CGTGATGACT GTCTCGGAGA CCTAGATTAC CTGTTCAAGC GACATGAGAA ATACCTGTAC 

AGCACTTTAA ACTACAGCAA GCTTATGAGG ACAGTTACAT GATAGTTACC ACCTCTACTT 1380 
TCGTGAAATT TGATGTCGTT CGAATACTCC TGTCAATGTA CTATCAATGG TGGAGATGAA 

TAGACAGGGA AAACATAGCA GCGTACTCTT TGACAGTAGT TGCAGAAGAC CTTGGCTTCC 1440 
ATCTGTCCCT TTTGTATCGT CGCATGAGAA ACTGTCATCA ACGTCTTCTG GAACCGAAGG 

CCTCATTGAA GACCAAAAAG TACTACACAG TCAAGGTTAG TGATGAGAAT GACAATGCAC 1500 
GGAGTAACTT CTGGTTTTTC ATGATGTGTC AGTTCCAATC ACTACTCTTA CTGTTACGTG 

CTGTATTTTC TAAACCCCAG TATGAAGCTT CTATTCTGGA AAATAATGCT CCAGGCTCTT 1560 
GACATAAAAG ATTTGGGGTC ATACTTCGAA GATAAGACCT TTTATTACGA GGTCCGAGAA 

ATATAACTAC AGTGATAGCC AGAGACTCTG ATAGTGATCA AAATGGCAAA GTAAATTACA 1620 
TATATTGATG TCACTATCGG TCTCTGAGAC TATCACTAGT TTTACCGTTT CATTTAATGT 

GACTTGTGGA TGCAAAAGTG ATGGGCCAGT CACTAACAAC ATTTGTTTCT CTTGATGCGG 1680 
CTGAACACCT ACGTTTTCAC TACCCGGTCA GTGATTGTTG TAAACAAAGA GAACTACGCC 

ACTCTGGAGT ATTGAGAGCT GTTAGGTCTT TAGACTATGA AAAACTTAAA CAACTGGATT 1740 
TGAGACCTCA TAACTCTCGA CAATCCAGAA ATCTGATACT TTTTGAATTT GTTGACCTAA 

TTGAAATTGA AGCTGCAGAC AATGGGATCC CTCAACTCTC CACTCGCGTT CAACTAAATC 1800 
AACTTTAACT TCGACGTCTG TTACCCTAGG GAGTTGAGAG GTGAGOGCAA GTTGATTTAG 

TCAGAATAGT TGATCAAAAT GATAATTGOC CTGTGATAAC TAATCCTCTT CTTAAXAATG 1860 
AGTCTTATCA ACTAGTTTTA CTATT A *ras iarw^ p^ny**?™ r.^R^r 

GCTCGGGTGA AGTTCTGCTT OOCATCAGOG CTCCTCAAAA CTATTTAGTT TTOCAGCTCA 1920 
CGAGCCCACT TCAAGACGAA GGGTAGTCGC GAGGAGTTTT GATAAATCAA AAGGTOGAGT 

AAGCCGAGGA TTCAGATGAA GGGCACAACT CCCAGCTGTT CTATACCATA CTGAGAGATC 1980 
TTOGGCTCCT AAGTCTACTT CCCGTGTTGA GGGTOGACAA GAXATGGTAT GACTCTCTAG 

CAAGCAGATT GTTTGOCATT AACAAAGAAA GTGGTGAAGT GTTCCTGAAA AAACAATTAA 2040 
GTTCGTCTAA CAAACGGTAA TTGTTTCTTT CACCACTTCA CAAGGACTTT TTTGTTAATT 

ACTCTGACCA TTCAGAGGAC TTGAGCATAG TAGTTGCAGT GTATGACTTG GGAAGACCTT 2100 
TGAGACTGGT AAGTCTCCTG AACTCGTATC ATCAACGTCA CATACTGAAC CCTTCTGGAA 

CATTATOCAC CAATGCTACA GTTAAATTCA TCCTCACCGA CTCTTTTCCT TCTAACGTTG 2160 
GTAATAGGTG GTTACGATGT CAATTTAAGT AGGAGTGGCT GAGAAAAGGA AGATTGCAAC 

AAGTCGTTAT TTTGCAACCA TCTGCAGAAG AGCAGCACCA GATCGATATG TCCATTATAT 2220
TTCAGCAATA AAACGTTGGT AGACGTCTTC TCGTCGTGGT CTAGCTATAC AGGTAAIATA 

TCATTGCAGT GCTGGCTGGT GGTTGTGCTT TGCTACTTTT GGCCATCTTT TTTGTGGCCT 2200 
AGTAACGTCA CGACCGACCA CCAACACGAA ACGATGAAAA CCGGTAGAAA AAACACCGGA 

GTACTTGTAA AAAGAAAGCT GGTGAATTTA AGCAGGTACC TGAACAACAC GGAACATGCA 2340 
CATGAACATT TTTCTTTCGA CCACTTAAAT TCGTCCAYGG ACTTGTTGTG CCTTGTACGT 

ATGAAGAACG CCTGTTAAGC ACCCCATCTC C0CA6T0GGT CTCTTCTTCT TTGTCTCAGT 2400 
TACTTCTTGC GGACAATTCG TGGGGTAGAG GGGTCAGCCA GAGAAGAAGA AACAGAGTCA 

CTGAGTCATG CCAACTCTCC ATCAATACTG AATCTGAGAA TTGCAGOGTG TCCTCTAACC 2460 
GACTCAGTAC GGTTGAGAGG TAGTTATGAC TTAGACTCTT AACGTCGCAC AGGAGATTGG 

AAGAGCAGCA TCAGCAAACA GGCATAAAGC ACTCCATCTC TGTACCATCT TATCACACAT 2520 
TTCTCGTOGT AGTCGTTTGT CCGTATTTCG TGAGGTAGAG ACATGGTAGA ATAGTGTGTA 

CTGGTTGGCA CCTGGACAAT TGTGCAAXGA GCATAAGTGG ACATTCTCAC ATGGGGCACA 2580 
GACCAACCGT GGACCTGTTA ACAOGTTACT CGTATTCACC TGTAAGAGTG TACCCCGTGT 

TTAGTACAAA GGTACAGTGG GCAAAGGAGA TAGTGACTTC AATGACAGTG ACTCTGATAC 2640 
AATCATGTTT CCATGTCACC CGTTTCCTCT ATCACTGAAG TTACTGTCAC TGAGACTATG 

TAGTGGAGAA TCAGAAAAGA AGAGGATTGA GCAGCCAATG CAGGCACAAG OCAGTGCTCA 2700 
ATCACCTCTT AGTCTTTTCT TCTCGTAACT CGTCGGTTAC GTCCGTGTTC GGTCACGAGT 

ATACACAGAT GAATCAGCAG GGTTOOGACA TGCCGATAAC TATTTCAGCC ACCGAATCAA 2760 
TATGTGTCTA CTTAGTCGTC CCAAGGCTGT ACGGCTATTG ATAAAGTCGG TGGCTTAGTT 

CAAGGGTCCA GAAAATGGGA ACTGCACATT GCAATATGAA AAGGGCTATA GACTGTCTTA 2B20 
GTTCCCAGGT CTTTTACCCT TGACGTGTAA OGTTATACTT TTCCCGATAT CTGACAGAAT 

CTCTGTAGCT CCTGTATATT ACAATACCTA CCATGCAAGA ATGOCTAACC TGCACATACC 2660 
GAGACATCGA GGACATATAA TGTTATGGAT GGTAOGTTCT TACGGATTGG ACGTGXATGG 

GAAOCATAOC CTTAGAGAOC CTTATTACCA TATCAATAAT CCTGTTGCTA ATCGGATGCA 2940 
CTTGGTATGG GAATCTCTGG GAATAATGGT ATAGTTATTA GGACAACGAT tAGCCTAOGT 

GGOGGAATAT GAAAGAGATT TAGtCAACAG AAGTGCAACG TTATCTCCGC AGAGATOGTC 3000 
COOOGIgATA Gt ' IPGPCTAA ATCAGVitWC tfCAOQTTGC AftTAGAQG QQ TCTCtAQCAG 

TAGCAGATAC CAAGAATTCA ATTACAGTCC GCAGATATCA AGACAGCTTC ATOCTTCAGA 3060 
ATCGTCTATG GTTCTTAAGT TAATGTCAGG CGTCTATAGT TCTGTCGAAG TAGGAAGTCT 

AATTGCTACA ACCTTTTAAT CATTAGGCAT GCAAGTGAGA ATGGACAAAG GCAAGTGCTT 3120 
TTAAOGATGT TGGAAAATTA GTAATCOGTA OGTTCACTCT TACGTGTTTC CGTTCAOGAA 

TAGCATGAAA GCTAAATATA TGGAGTCTCC CCTTTCCCTC TGATGGATGG GGGGAGACAC 3180 
ATOGTACTTT CGATTTATAT ACCTCAGAGG GGAAAGGGA6 ACTAOCTACC OCOCTCTGTG 

AGGACAGTGC ATAAATATAC AGCTGCTTTC TATTTGCATT TCACTTGGGA ATTTTTTGTT 3240 
TOCTGTCACG TATTTATATG TCGACGAAAG ATAAACGTAA AGTGAACCCT TAAAAAACAA 

T TTTT TACAT ATTTATTTTT CCTGAATTGA ATGTGACATT GTCCTGTCAC CTAACTAGCA 3300 
AAAAAATGTA TAAATAAAAA GGACTTAACT TACACTGTAA CAGGACAGTG GATTGATCGT 



Figure 6C 


ATTAAATCCA CAGACCTACA GTCAAATATT TGAGGGCCCC TGAAACAGCA CATCAGTCAG 3360
TAATTTAGGT GTCTGGATGT CAGTTTATAA ACTCCCGGGG ACTTTGTCGT GTAGTCAGTC 

GACCTAAAGT GGCCTTTTTA CTTTTAGCAG CTOCTGGGTC TGCCCTCTGT GTTAAXCAGC 3420 
CTGGATTTCA CCGGAAAAAT GAAAATCGTC GAGGACCCAG ACGGGAGACA CAATTAGTCG 

CCCTGGTCAA GTCCTGAGTA GGATCATGGC GTTl 'TTATAT GCATCTCACC TACTTTGGAC 3480 
GGGACCAGTT CAGGACTCAT CCTAGTACCG CAAAAATATA CGTAGAGTGG ATGAAACCTG 

GTGATTTACA CATAATAGGA AACGCTTGGT TTCAGTGAAG TCTGTGTTGT ATATATTCTG 3540 
CACTAAATGT GTATTATCCT TTGCGAACCA AAGTCACTTC AGACACAACA TATATAAGAC 

TTATATACAC GCATTTTGTG TTTGTGTATA TATTTCAAGT CCATTCAGAT ATGTGTATAT 3600 
AATATATGTG CGTAAAACAC AAACACATAT ATAAAGTTCA GGTAAGTCTA TACACATATA 

AGTGCAGACC TTGTAAATTA AATATTCTGA TACTTTTTCC TCAATAAATA TTTAAAT
TCACGTCTGG AACATTTAAT TTATAAGACT ATGAAAAAGG AGTTATTTAT AAATTTA 



Figure 6D 

MVCCGPGRML LGWAGLLVLA ALCLLQVPGA QAAACEPVRI PLCKSLPWNM TKMPNHLHHS 60

TQANAILAME QFEGLLGTHC SPDLLFFLCA MYAPICTIDF QHEPIKPCKS VCERARQGCE 120

PILIKYRHSW PESLACDELP VYDRGVCISP EAIVTADGAD FPMDSSTGHC RGASSERCKC 180 

KPVRATQKTY FRNNYNYVIR AKVKEVKMKC HDVTAVVEVK EILKASLVNI PRDTVNLYTT 240

SGCLCPPLTV NEEYVIMGYE DEERSRLLLV EGSIAEKWKD RLGKKVKRWD MKLRHLGLGK 300 
TDASDSTQNQ KSGRNSNPRP ARS. 
AAGCCTGGGA CCATGGTCTG CTGCGGCCCG GGACGGATGC TGCTAGGATG GGCCGGGTTG 60
TTCGGACCCT GGTACCAGAC GACGCCGGGC CCTGCCTACG ACGATCCTAC CCGGCCCAAC 

CTAGTCCTGG CTGCTCTCTG CCTGCTCCAG GTGCCCGGAG CTCAGGCTGC AGCCTGTGAG 120 
GATCAGGACC GACGAGAGAC GGACGAGGTC CACGGGCCTC GAGTCCGACG TCGGACACTC 

CCTGTCCGCA TCCCGCTGTG CAAGTCCCTT CCCTGGAACA TGACCAAGAT GCCCAACCAC 180 
GGACAGGCGT AGGGCGACAC GTTCAGGGAA GGGACCTTGT ACTGGTTCTA CGGGTTGGTG 

CTGCACCACA GCACCCAGGC TAACGCCATC CTGGCCATGG AACAGTTCGA AGGGCTGCTG 240 
GACGTGGTGT CGTGGGTCCG ATTGCGGTAG GACCGGTACC TTGTCAAGCT TCCCGACGAC 

GGCACCCACT GCAGCCCGGA TCTTCTCTTC TTCCTCTGTG CAATGTACGC ACCCATTTGC 300 
CCGTGGGTGA CGTCGGGCCT AGAAGAGAAG AAGGAGACAC GTTACATGCG TGGGTAAACG 

ACCATCGACT TCCAGCACGA GCCCATCAAG CCCTGCAAGT CTGTGTGTGA GCGCGCCCGA 360 
TGGTAGCTGA AGGTCGTGCT CGGGTAGTTC GGGACGTTCA GACACACACT CGCGCGGGCT 

CAGGGCTGCG AGCCCATTCT CATCAAGTAC CGCCACTCGT GGCCGGAAAG CTTGGCCTGC 420 
GTCCCGACGC TCGGGTAAGA GTAGTTCATG GCGGTGAGCA CCGGCCTTTC GAACCGGACG 

GACGAGCTGC CGGTGTACGA CCGCGGCGTG TGCATCTCTC CTGAGGCCAT CGTCACCGCG 480 
CTGCTCGACG GCCACATGCT GGCGCCGCAC ACGTAGAGAG GACTCCGGTA GCAGTGGCGC

GACGGAGCGG ATTTTCCTAT GGATTCAAGT ACTGGACACT GCAGAGGGGC AAGCAGCGAA 540 
CTGCCTCGCC TAAAAGGATA CCTAAGTTCA TGACCTGTGA CGTCTCCCCG TTCGTCGCTT

CGTTGCAAAT GTAAGCCTGT CAGAGCTACA CAGAAGACCT ATTTCCGGAA CAATTACAAC 600 
GCAACGTTTA CATTCGGACA GTCTCGATGT GTCTTCTGGA TAAAGGCCTT GTTAATGTTG

TATGTCATCC GGGCTAAAGT TAAAGAGGTA AAGATGAAAT GTCATGATGT GACCGCCGTT 660
ATACAGTAGG CCCGATTTCA ATTTCTCCAT TTCTACTTTA CAGTACTACA CTGGCGGCAA 

GTGGAAGTGA AGGAAATTCT AAAGGCATCA CTGGTAAACA TTCCAAGGGA CACCGTCAAT 720 
CACCTTCACT TCCTTTAAGA TTTCCGTAGT GACCATTTGT AAGGTTCCCT GTGGCAGTTA 

CTTTATACCA CCTCTGGCTG CCTCTGTCCT CCACTTACTG TCAATGAGGA ATATGTCATC 780 
GAAATATGGT GGAGACCGAC GGAGACAGGA GGTGAATGAC AGTTACTCCT TATACAGTAG 



ATGGGCTATG AAGACGAGGA ACGTTCCAGG TTACTCTTGG TAGAAGGCTC TATAGCTGAG 840 
TACCCGATAC TTCTGCTCCT TGCAAGGTCC AATGAGAACC ATCTTCCGAG ATATCGACTC

AAGTGGAAGG ATCGGCTTGG TAAGAAAGTC AAGCGCTGGG ATATGAAACT CCGACACCTT 900 
TTCACCTTCC TAGCCGAACC ATTCTTTCAG TTCGCGACCC TATACTTTGA GGCTGTGGAA 

GGACTGGGTA AAACTGATGC TAGCGATTCC ACTCAGAATC AGAAGTCTGG CAGGAACTCT 960 
CCTGACCCAT TTTGACTACG ATCGCTAAGG TGAGTCTTAG TCTTCAGACC GTCCTTGAGA 



Figure 8A 

AATCCCCGGC CAGCACGCAG CTAAATCCTG AAATGTAAAA GGCCACACCC ACGGACTCCC 1020 
TTAGGGGCCG GTCGTGCGTC GATTTAGGAC TTTACATTTT CCGGTGTGGG TGCCTGAGGG 

TTCTAAGACT GGCGCTGGTG GACTAACAAA GGAAAACCGC ACAGTTGTGC TCGTGACCGA 1080 
AAGATTCTGA CCGCGACCAC CTGATTGTTT CCTTTTGGCG TGTCAACACG AGCACTGGCT 

TTGTTTACCG CAGACACCGC GTGGCTACCG AAGTTACTTC CGGTCCCCTT TCTCCTGCTT 1140 
AACAAATGGC GTCTGTGGCG CACCGATGGC TTCAATGAAG GCCAGGGGAA AGAGGACGAA 

CTTAATGGCG TGGGGTTAGA TCCTTTAATA TGTTATATAT TCTGTTTCAT CAATCACGTG 1200 
GAATTACCGC ACCCCAATCT AGGAAATTAT ACAATATATA AGACAAAGTA GTTAGTGCAC 

GGGACTGTTC TTTTGCAACC AGAATAGTAA ATTAAATATG TTGATGCTAA GGTTTCTGTA 1260 
CCCTGACAAG AAAACGTTGG TCTTATCATT TAATTTATAC AACTACGATT CCAAAGACAT 

CTGGACTCCC TGGGTTTAAT TTGGTGTTCT GTACCCTGAT TGAGAATGCA ATGTTTCATG 1320 
GACCTGAGGG ACCCAAATTA AACCACAAGA CATGGGACTA ACTCTTACGT TACAAAGTAC 

TAAAGAGAGA ATCCTGGTCA TATCTCAAGA ACTAGATATT GCTGTAAGAC AGCCTCTGCT 1380 
ATTTCTCTCT TAGGACCAGT ATAGAGTTCT TGATCTATAA CGACATTCTG TCGGAGACGA 

GCTGCGCTTA TAGTCTTGTG TTTGTATGCC TTTGTCCATT TCCCTCATGC TGTGAAAGTT 1440
CGACGCGAAT ATCAGAACAC AAACATACGG AAACAGGTAA AGGGAGTACG ACACTTTCAA 

ATACATGTTT ATAAAGGTAG AACGGCATTT TGAAATCAGA CACTGCACAA GCAGAGTAGC 1500 
TATGTACAAA TATTTCCATC TTGCCGTAAA ACTTTAGTCT GTGACGTGTT CGTCTCATCG 

CCAACACCAG GAAGCATTTA TGAGGAAACG CCACACAGCA TGACTTATTT TCAAGATTGG 1560 
GGTTGTGGTC CTTCGTAAAT ACTCCTTTGC GGTGTGTCGT ACTGAATAAA AGTTCTAACC 

CAGGCAGCAA AATAAATAGT GTTGGGAGCC AAGAAAAGAA TATTTTGCCT GGTTAAGGGG 1620 
GTCCGTCGTT TTATTTATCA CAACCCTCGG TTCTTTTCTT ATAAAACGGA CCAATTCCCC

CACACTGGAA TCAGTAGCCC TTGAGCCATT AACAGCAGTG TTCTTCTGGC AAGTTTTTGA 1680 
GTGTGACCTT AGTCATCGGG AACTCGGTAA TTGTCGTCAC AAGAAGACCG TTCAAAAACT



TTTGTTCATA AATGTATTCA CGAGCATTAG AGATGAACTT ATAACTAGAC ATCTGTTGTT 1740 
AAACAAGTAT TTACATAAGT GCTCGTAATC TCTACTTGAA TATTGATCTG TAGACAACAA 

ATCTCTATAG CTCTGCTTCC TTCTAAATCA AACCCATTGT TGGATGCTCC CTCTCCATTC 1800 
TAGAGATATC GAGACGAAGG AAGATTTAGT TTGGGTAACA ACCTACGAGG GAGAGGTAAG 
ATAAATAAAT TTGGCTTGCT GTATTGGCCA GGAAAAGAAA GTATTAAAGT ATGCATGCAT 1860
TATTTATTTA AACCGAACGA CATAACCGGT CCTTTTCTTT CATAATTTCA TACGTACGTA 

GTGCACCAGG GTGTTATTTA ACAGAGGTAT GTAACTCTAT AAAAGACTAT AATTTACAGG 1920 
CACGTGGTCC CACAATAAAT TGTCTCCATA CATTGAGATA TTTTCTGATA TTAAATGTCC 

ACACGGAAAT GTGCACATTT GTTTACTTTT TTTCTTCCTT TTGCTTTGGG CTTGTGATTT 1980 
TGTGCCTTTA CACGTGTAAA CAAATGAAAA AAAGAAGGAA AACGAAACCC GAACACTAAA 

TGGTTTTTGG TGTGTTTATG TCTGTATTTT GGGGGGTGGG TAGGTTTAAG CCATTGCACA 2040 
ACCAAAAACC ACACAAATAC AGACATAAAA CCCCCCACCC ATCCAAATTC GGTAACGTGT 

TTCAAGTTGA ACTAGATTAG AGTAGACTAG GCTCATTGGC CTAGACATTA TGATTTGAAT 2100 
AAGTTCAACT TGATCTAATC TCATCTGATC CGAGTAACCG GATCTGTAAT ACTAAACTTA 

TTGTGTTGTT TAATGCTCCA TCAAGATGTC TAATAAAAGG AATATGGTTG TCAACAGAGA 2160 
AACACAACAA ATTACGAGGT AGTTCTACAG ATTATTTTCC TTATACCAAC AGTTGTCTCT 

CGACAACAAC AACAAA 
GCTGTTGTTG TTGTTT 
MVCGSPGGML LLRAGLLALA ALCLLRVPGA RAAACEPVRI PLCKSLPWNM TKMPNHLHHS 60

TQANAILAIE QFEGLLGTHC SPDLLFFLCA MYAPICTIDF QHEPIKPCKS VCERARQGCE 120 

PILIKYRHSW PENLACEELP VYDRGVCISP EAIVTADGAD FPMDSSNGNC RGASSERCKC 180 

KPIRATQKTY FRNNYNYVIR AKVKEIKTKC HDVTAVVEVK EILKSSLVNI PRDTVNLYTS 240

SGCLCPPLNV NEEYIIMGYE DEERSRLLLV EGSIAEKWKD RLGKKVKRWD MKLRHLGLSK 300 
SDSSNSDSTQ SQKSGRNSNP RQARN. 
GGCGGAGCGG GCCTTTTGGC GTCCACTGCG CGGCTGCACC CTGCCCCATC TGCCGGGATC 60
CCGCCTCGCC CGGAAAACCG CAGGTGACGC GCCGACGTGG GACGGGGTAG ACGGCCCTAG 

ATGGTCTGCG GCAGCCCGGG AGGGATGCTG CTGCTGCGGG CCGGGCTGCT TGCCCTGGCT 120 
TACCAGACGC CGTCGGGCCC TCCCTACGAC GACGACGCCC GGCCCGACGA ACGGGACCGA 

GCTCTCTGCC TGCTCCGGGT GCCCGGGGCT CGGGCTGCAG CCTGTGAGCC CGTCCGCATC 180 
CGAGAGACGG ACGAGGCCCA CGGGCCCCGA GCCCGACGTC GGACACTCGG GCAGGCGTAG 

CCCCTGTGCA AGTCCCTGCC CTGGAACATG ACTAAGATGC CCAACCACCT GCACCACAGC 240 
GGGGACACGT TCAGGGACGG GACCTTGTAC TGATTCTACG GGTTGGTGGA CGTGGTGTCG 

ACTCAGGCCA ACGCCATCCT GGCCATCGAG CAGTTCGAAG GTCTGCTGGG CACCCACTGC 300 
TGAGTCCGGT TGCGGTAGGA CCGGTAGCTC GTCAAGCTTC CAGACGACCC GTGGGTGACG 

AGCCCCGATC TGCTCTTCTT CCTCTGTGCC ATGTACGCGC CCATCTGCAC CATTGACTTC 360
TCGGGGCTAG ACGAGAAGAA GGAGACACGG TACATGCGCG GGTAGACGTG GTAACTGAAG 

CAGCACGAGC CCATCAAGCC CTGTAAGTCT GTGTGCGAGC GGGCCCGGCA GGGCTGTGAG 420 
GTCGTGCTCG GGTAGTTCGG GACATTCAGA CACACGCTCG CCCGGGCCGT CCCGACACTC

CCCATACTCA TCAAGTACCG CCACTCGTGG CCGGAGAACC TGGCCTGCGA GGAGCTGCCA 480
GGGTATGAGT AGTTCATGGC GGTGAGCACC GGCCTCTTGG ACCGGACGCT CCTCGACGGT 

GTGTACGACA GGGGCGTGTG CATCTCTCCC GAGGCCATCG TTACTGCGGA CGGAGCTGAT 540 
CACATGCTGT CCCCGCACAC GTAGAGAGGG CTCCGGTAGC AATGACGCCT GCCTCGACTA

TTTCCTATGG ATTCTAGTAA CGGAAACTGT AGAGGGGCAA GCAGTGAACG CTGTAAATGT 600 
AAAGGATACC TAAGATCATT GCCTTTGACA TCTCCCCGTT CGTCACTTGC GACATTTACA 

AAGCCTATTA GAGCTACACA GAAGACCTAT TTCCGGAACA ATTACAACTA TGTCATTCGG 660 
TTCGGATAAT CTCGATGTGT CTTCTGGATA AAGGCCTTGT TAATGTTGAT ACAGTAAGCC 

GCTAAAGTTA AAGAGATAAA GACTAAGTGC CATGATGTGA CTGCAGTAGT GGAGGTGAAG 720 
CGATTTCAAT TTCTCTATTT CTGATTCACG GTACTACACT GACGTCATCA CCTCCACTTC 



GAGATTCTAA AGTCCTCTCT GGTAAACATT CCACGGGACA CTGTCAACCT CTATACCAGC 780 
CTCTAAGATT TCAGGAGAGA CCATTTGTAA GGTGCCCTGT GACAGTTGGA GATATGGTCG 

TCTGGCTGCC TCTGCCCTCC ACTTAATGTT AATGAGGAAT ATATCATCAT GGGCTATGAA 840 
AGACCGACGG AGACGGGAGG TGAATTACAA TTACTCCTTA TATAGTAGTA CCCGATACTT 
GATGAGGAAC GTTCCAGATT ACTCTTGGTG GAAGGCTCTA TAGCTGAGAA GTGGAAGGAT 900
CTACTCCTTG CAAGGTCTAA TGAGAACCAC CTTCCGAGAT ATCGACTCTT CACCTTCCTA 

CGACTCGGTA AAAAAGTTAA GCGCTGGGAT ATGAAGCTTC GTCATCTTGG ACTCAGTAAA 960 
GCTGAGCCAT TTTTTCAATT CGCGACCCTA TACTTCGAAG CAGTAGAACC TGAGTCATTT 

AGTGATTCTA GCAATAGTGA TTCCACTCAG AGTCAGAAGT CTGGCAGGAA CTCGAACCCC 1020 
TCACTAAGAT CGTTATCACT AAGGTGAGTC TCAGTCTTCA GACCGTCCTT GAGCTTGGGG 

CGGCAAGCAC GCAACTAAAT CCCGAAATAC AAAAAGTAAC ACAGTGGACT TCCTATTAAG 1080 
GCCGTTCGTG CGTTGATTTA GGGCTTTATG TTTTTCATTG TGTCACCTGA AGGATAATTC 

ACTTACTTGC ATTGCTGGAC TAGCAAAGGA AAATTGCACT ATTGCACATC ATATTCTATT 1140 
TGAATGAACG TAACGACCTG ATCGTTTCCT TTTAACGTGA TAACGTGTAG TATAAGATAA

GTTTACTATA AAAATCATGT GATAACTGAT TATTACTTCT GTTTCTCTTT TGGTTTCTCC 1200 
CAAATGATAT TTTTAGTACA CTATTGACTA ATAATGAAGA CAAAGAGAAA ACCAAAGACG 

TTCTCTCTTC TCTCAACCCC TTTGTAATGG TTTGGGGGCA GACTCTTAAG TATATTGTGA 1260 
AAGAGAGAAG AGAGTTGGGG AAACATTACC AAACCCCGGT CTGAGAATTC ATATAACACT 

GTTTTCTATT TCACTAATCA TGAGAAAAAC TGTTCTTTTG CAATAATAAT AAATTAAACA 1320 
CAAAAGATAA AGTGATTAGT ACTCTTTTTG ACAAGAAAAC GTTATTATTA TTTAATTTGT 

TGCTGTTACC AGAGCCTCTT TGCTGAGTCT CCAGATGTTA ATTTACTTTC TGCACCCCAA 1380 
ACGACAATGG TCTCGGAGAA ACGACTCAGA GGTCTACAAT TAAATGAAAG ACGTGGGGTT 

TTGGGAATGC AATATTGGAT GAAAAGAGAG GTTTCTGGTA TTCACAGAAA GCTAGATATG 1440 
AACCCTTACG TTATAACCTA CTTTTCTCTC CAAAGACCAT AAGTGTCTTT CGATCTATAC 

CCTTAAAACA TACTCTGCCG ATCTAATTAC AGCCTTATTT TTGTATGCCT TTTGGGCATT 1500 
GGAATTTTGT ATGAGACGGC TAGATTAATG TCGGAATAAA AACATACGGA AAACCCGTAA 

CTCCTCATGC TTAGAAAGTT CCAAATGTTT ATAAAGGTAA AATGGCAGTT TGAAGTCAAA 1560 
GAGGAGTACG AATCTTTCAA GGTTTACAAA TATTTCCATT TTACCGTCAA ACTTCAGTTT 

TGTCACATAG GCAAAGCAAT CAAGCACCAG GAAGTGTTTA TGAGGAAACA ACACCCAAGA 1620
ACAGTGTATC CGTTTCGTTA GTTCGTGGTC CTTCACAAAT ACTCCTTTGT TGTGGGTTCT

TGAATTATTT TTGAGACTGT CAGGAAGTAA AATAAATAGG AGCTTAAGAA AGAACATTTT 1680 
ACTTAATAAA AACTCTGACA GTCCTTCATT TTATTTATCC TCGAATTCTT TCTTGTAAAA 

GCCTGATTGA GAAGCACAAC TGAAACCAGT AGCCGCTGGG GTGTTAATGG TAGCATTCTT 1740 
CGGACTAACT CTTCGTGTTG ACTTTGGTCA TCGGCGACCC CACAATTACC ATCGTAAGAA 

CTTTTGGCAA TACATTTGAT TTGTTCATGA ATATATTAAT CAGCATTAGA GAAATGAATT 1800 
GAAAACCGTT ATGTAAACTA AACAAGTACT TATATAATTA GTCGTAATCT CTTTACTTAA 

ATAACTAGAC ATCTGCTGTT ATCACCATAG TTTTGTTTAA TTTGCTTCCT TTTAAATAAA 1860 
TATTGATCTG TAGACGACAA TAGTGGTATC AAAACAAATT AAACGAAGGA AAATTTATTT 

CCCATTGGTG AAAGTCAAAA AAAAAAAAAA AAA 
GGGTAACCAC TTTCAGTTTT TTTTTTTTTT TTT